<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LLM on Vladimir Samoylov</title><link>https://cageyv.dev/series/llm/</link><description>Recent content in LLM on Vladimir Samoylov</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 06 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://cageyv.dev/series/llm/index.xml" rel="self" type="application/rss+xml"/><item><title>Experiment: Self-hosting an LLM on AWS Inf2/Trn3</title><link>https://cageyv.dev/posts/inf2-trn3-vllm-neuron-experiment/</link><pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate><guid>https://cageyv.dev/posts/inf2-trn3-vllm-neuron-experiment/</guid><description>&lt;p>I wanted a simple, repeatable way to answer a practical question:&lt;/p>
&lt;blockquote>
&lt;p>“If we had to run a private model on AWS, what’s the operational reality on Inf2 / Trn3?”&lt;/p>&lt;/blockquote>
&lt;p>This post documents one small &lt;strong>hands-on&lt;/strong> experiment (Inf2) plus the exact follow-up checks I’d run on &lt;strong>Trn3&lt;/strong>.&lt;/p>
&lt;h2 id="experiment-setup-inf2">
 Experiment setup (Inf2)
&lt;/h2>
&lt;p>This was a quick “can we get it working end-to-end?” test, not a full benchmark suite.&lt;/p></description></item></channel></rss>