AI Infrastructure
From snowmelt
to silicon.
The first institutional GPU cluster in Nepal. H100, A100, L40S and RTX 6000 Ada GPUs, fed by hydropower from the Himalayan watershed and solar from the Lumbini foothills, with a trained annotation team in the same building.
The cluster
Hardware on the floor.
Purpose-built for training, fine-tuning, inference and research. Available wholesale on a per-hour or monthly reserved basis. Engineering support included; sensitive workloads run on isolated clusters.
| GPU | VRAM | Best for | Price from |
|---|---|---|---|
| H100 SXM | 80 GB HBM3 | Frontier model training, large-context inference, RLHF runs at scale | $3.20/hr |
| A100 SXM | 80 GB HBM2e | Mid-scale training, fine-tuning, production inference | $1.80/hr |
| L40S | 48 GB GDDR6 | Inference, multimodal serving, fine-tuning under 30B params | $1.10/hr |
| RTX 6000 Ada | 48 GB GDDR6 | Workstation-class research, vision, fine-tuning, lower-volume inference | $0.80/hr |
Reserved monthly rates start at 22% below the hourly price · Volumes from 100k unit-hours / month are negotiated separately
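To make the pricing concrete, here is a rough estimator built on the listed "from" rates. It is a sketch, not a quote: it assumes the reserved discount is exactly the 22% figure above, that a billing month is 730 hours, and that rates scale linearly with GPU count. The function and constant names are illustrative only.

```python
# On-demand "from" rates in USD per GPU-hour, copied from the table above.
RATES_PER_HOUR = {
    "H100 SXM": 3.20,
    "A100 SXM": 1.80,
    "L40S": 1.10,
    "RTX 6000 Ada": 0.80,
}

# Assumption: the reserved discount is exactly the 22% figure quoted above.
RESERVED_DISCOUNT = 0.22

# Assumption: a billing month of 730 hours (8,760 hours per year / 12).
HOURS_PER_MONTH = 730


def on_demand_cost(gpu: str, gpus: int, hours: float) -> float:
    """Cost in USD of running `gpus` cards for `hours` at the listed rate."""
    return round(RATES_PER_HOUR[gpu] * gpus * hours, 2)


def reserved_monthly_cost(gpu: str, gpus: int) -> float:
    """Cost in USD of reserving `gpus` cards for one full month."""
    full_price = RATES_PER_HOUR[gpu] * gpus * HOURS_PER_MONTH
    return round(full_price * (1 - RESERVED_DISCOUNT), 2)


if __name__ == "__main__":
    # Eight H100s for one day, on demand.
    print(on_demand_cost("H100 SXM", gpus=8, hours=24))
    # Eight A100s reserved for a month.
    print(reserved_monthly_cost("A100 SXM", gpus=8))
```

Actual invoices may differ, since volume pricing and custom terms are negotiated separately.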
Training
Training and fine-tuning, managed end to end.
Bring a base model and a dataset, leave with weights and a deployment. We handle orchestration, checkpointing, evaluation and the messy parts in between. Engineering support on our side; reproducible runs on yours.
LoRA
Lightweight adaptation
Domain-tuned LoRAs and QLoRAs. Fast iteration cycles, small artefacts, low cost.
Full fine-tune
End-to-end weight updates
Full-parameter fine-tunes for production-grade specialised models. DDP and FSDP supported.
RLHF / DPO
Preference optimisation
Reward model training and policy optimisation, paired with the in-house annotation pipeline if needed.
Quantisation
Export and deploy
GGUF export, plus AWQ and INT8 quantisation, for edge or low-cost inference. Calibration data included.
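For readers new to quantisation, the idea behind INT8 can be sketched in a few lines. This is a generic symmetric, per-tensor, weight-only scheme with an absmax scale. It is not the GGUF or AWQ format, and not necessarily the pipeline used on the cluster. The function names are illustrative.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor INT8 quantisation using an absmax scale."""
    absmax = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    # The round-trip error per weight is at most half a quantisation step.
    for original, approx in zip(weights, restored):
        print(original, approx, abs(original - approx) <= scale / 2 + 1e-12)
```

Each weight is stored in one byte plus a shared scale, at the cost of a rounding error of at most half a step. Methods that also quantise activations typically use calibration data to choose their ranges.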
REST / gRPC
Hosted endpoints
Deploy a model behind an HTTPS endpoint with auto-scaling, load balancing and request logging.
Latency
<200ms standard
Sub-200ms response on standard text and vision workloads. SLAs on reserved capacity.
Wholesale APIs
White-label inference
Annotation, content, QA and compute APIs available white-label. Volume pricing, custom SLAs.
Isolation
Dedicated clusters
Sensitive workloads run on dedicated nodes with full isolation. Sovereign deployment available.
Inference
Serving infrastructure that holds.
Hosted endpoints, white-label APIs, dedicated clusters for sensitive workloads. The hard parts — autoscaling, observability, GPU scheduling — are ours.
Wholesale
Compute that pays for itself.
H100 / hr
$3.20
From · per hour
Frontier-class training and large-context inference. Reserved monthly rates start at 22% below the hourly price.
A100 / hr
$1.80
From · per hour
Workhorse for fine-tuning and production inference. Most common starting point.
L40S / hr
$1.10
From · per hour
Inference and multimodal serving with strong cost-per-token economics.
Why this geography
The cheapest clean compute on the planet.
Nepal sits on more than 42,000 MW of viable hydropower running off the Himalayan watershed. The Lumbini foothills receive some of the highest solar irradiance in Asia. We're tying both directly into compute capacity — no diesel hedge, no carbon offsets, no marketing line.
Sovereign positioning helps too: Nepal sits between China and India with access to both and obligations to neither. For partners who care about jurisdictional independence, that matters.
Hydropower
42,000 MW
Viable hydropower in the Himalayan watershed. Stable base-load for compute.
Solar
100 MW contracted
Lumbini foothills receive some of the highest irradiance in Asia. Powering compute directly.
Sovereignty
Between two giants
Access to China and India, obligations to neither. Useful for jurisdictionally sensitive workloads.
Operators
Annotation in the same building
Training, fine-tuning, RLHF and human-feedback loops handled end to end, without a vendor handoff.
Get on the cluster
Reserve a slot. Run a pilot. Scale from there.
Whether you need one H100 for an evening or 100 nodes reserved for a quarter, start with an email and get a response within one business day.
hello@himalayansiliconvalley.com · Lumbini · Kathmandu
