<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Blog Posts | Thai-Hoang Nguyen</title>
    <link>https://thaihoang.org/blog/</link>
      <atom:link href="https://thaihoang.org/blog/index.xml" rel="self" type="application/rss+xml" />
    <description>Blog Posts</description>
    <generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Sat, 21 Sep 2024 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://thaihoang.org/media/logo_hu0b9dfbb3ba87f50d3e34442b83da7c44_14671_300x300_fit_lanczos_3.png</url>
      <title>Blog Posts</title>
      <link>https://thaihoang.org/blog/</link>
    </image>
    
    <item>
      <title>How to run an LLM locally using only a MacBook Air M1/M2</title>
      <link>https://thaihoang.org/blog/llm-macbook/</link>
      <pubDate>Sat, 21 Sep 2024 00:00:00 +0000</pubDate>
      <guid>https://thaihoang.org/blog/llm-macbook/</guid>
      <description>&lt;style type=&#34;text/css&#34;&gt;

pre {
  max-height: 350px;
  overflow-y: auto;
}

pre[class] {
  max-height: 350px;
}
::-webkit-scrollbar {
  -webkit-appearance: none;
  width: 7px;
  scrollbar-color: red yellow;
}

::-webkit-scrollbar-thumb {
  border-radius: 4px;
  background-color: rgba(0, 0, 0, .5);
  box-shadow: 0 0 1px rgba(255, 255, 255, .5);
  scrollbar-color: red yellow;
}

.scrollable-element {

}

&lt;/style&gt;
&lt;p&gt;Running large language models (LLMs) locally has become practical for developers,
enthusiasts, and professionals alike. An openly available model such as
Llama 3.1 can run directly on a MacBook Air M1 or M2, keeping your data on
your own machine and under your control. &lt;code&gt;Ollama&lt;/code&gt; is a tool that makes this straightforward. In this blog post,
I&amp;rsquo;ll walk you through setting up an LLM on your
MacBook Air using Ollama.&lt;/p&gt;
&lt;h1 id=&#34;why-run-an-llm-locally&#34;&gt;Why Run an LLM Locally?&lt;/h1&gt;
&lt;p&gt;Before diving into the setup, let&amp;rsquo;s explore why running an LLM locally is beneficial:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Privacy and Security&lt;/strong&gt;: Your prompts and data stay on your device instead of being sent to a third-party service.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Responses do not depend on your internet connection or on a cloud provider&amp;rsquo;s rate limits; speed is bounded by your Mac&amp;rsquo;s chip and memory instead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost-Effective&lt;/strong&gt;: Avoid recurring costs associated with cloud-based AI services.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Customization&lt;/strong&gt;: Adjust system prompts and sampling parameters, or swap models, to suit your specific needs and projects.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With these advantages in mind, let&amp;rsquo;s get started on setting up your LLM.&lt;/p&gt;
&lt;h1 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h1&gt;
&lt;p&gt;Before we begin, ensure you have the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;MacBook Air M1 or M2&lt;/strong&gt;: This guide targets Apple silicon. Models are loaded into the Mac&amp;rsquo;s unified memory, so smaller models are the practical choice on machines with 8 GB of RAM.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Homebrew Installed&lt;/strong&gt;: Homebrew is a package manager for macOS that simplifies the installation of software. If you haven&amp;rsquo;t installed it yet, you can do so by running the following command in your Terminal:
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/bin/bash -c &lt;span class=&#34;s2&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span class=&#34;k&#34;&gt;$(&lt;/span&gt;curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh&lt;span class=&#34;k&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Basic Terminal Knowledge&lt;/strong&gt;: Familiarity with using the Terminal will help streamline the setup process.&lt;/li&gt;
&lt;/ol&gt;
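&lt;p&gt;The prerequisites above can be checked from the Terminal. The following is a minimal sketch rather than official tooling; the helper names &lt;code&gt;have_cmd&lt;/code&gt; and &lt;code&gt;check_prereqs&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: report whether this machine meets the prerequisites listed above.
# have_cmd and check_prereqs are hypothetical helper names.

have_cmd() {
  # Succeeds if the named command can be found on PATH.
  command -v "$1" >/dev/null
}

check_prereqs() {
  # Apple silicon Macs report "arm64" as the machine architecture.
  if [ "$(uname -m)" = "arm64" ]; then
    echo "chip: Apple silicon"
  else
    echo "chip: $(uname -m) (this guide targets Apple silicon)"
  fi

  if have_cmd brew; then
    echo "homebrew: installed"
  else
    echo "homebrew: missing"
  fi
}

check_prereqs
```

&lt;p&gt;If the second line reports Homebrew as missing, run the install command from the list above before continuing.&lt;/p&gt;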
&lt;h1 id=&#34;step-1-install-ollama&#34;&gt;Step 1: Install Ollama&lt;/h1&gt;
&lt;p&gt;Ollama is a command-line tool for downloading, managing, and running LLMs. It runs on macOS, Linux, and Windows, and it uses the GPU on Apple silicon Macs.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Open Terminal&lt;/strong&gt;: You can find Terminal in your Applications &amp;gt; Utilities folder or by searching via Spotlight.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install Ollama via Homebrew&lt;/strong&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;brew install ollama
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This command downloads and installs the Ollama CLI. Homebrew handles dependencies, ensuring a smooth installation process.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Verify Installation&lt;/strong&gt;:
After installation, confirm that Ollama is correctly installed by checking its version:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama --version
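&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You should see the version number displayed, indicating a successful installation. The Homebrew formula does not start Ollama&amp;rsquo;s background server for you, so start it with &lt;code&gt;brew services start ollama&lt;/code&gt; (or run &lt;code&gt;ollama serve&lt;/code&gt; in a separate Terminal window) before running a model.&lt;/p&gt;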
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&#34;step-2-choose-and-install-your-llm&#34;&gt;Step 2: Choose and Install Your LLM&lt;/h1&gt;
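&lt;p&gt;Ollama runs openly available models such as Llama, Mistral, Gemma, and Phi. Proprietary models like GPT-3 are offered only through a hosted API and cannot be downloaded, so this guide uses &lt;strong&gt;Llama 3.1&lt;/strong&gt;, but feel free to explore other models based on your needs.&lt;/p&gt;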
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;List Available Models&lt;/strong&gt;:
To see which models are available, browse Ollama&amp;rsquo;s &lt;a href=&#34;https://ollama.com/library&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;model library&lt;/a&gt;. The CLI has no search command; it downloads models by name.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Download the Llama 3.1 Model&lt;/strong&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama pull llama3.1
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Replace &amp;ldquo;llama3.1&amp;rdquo; with the specific model name if you&amp;rsquo;re opting for a different one. The download may take a few minutes, since model files are typically several gigabytes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Confirm Installation&lt;/strong&gt;:
Once the download finishes, list your local models to ensure Llama 3.1 is ready:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama list
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You should see &lt;code&gt;llama3.1&lt;/code&gt; listed among the available models.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
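&lt;p&gt;Because model downloads are large, it can help to check free disk space first. This is a rough sketch under stated assumptions: the helper names &lt;code&gt;free_gb&lt;/code&gt; and &lt;code&gt;enough_space&lt;/code&gt; are hypothetical, and the 10 GB threshold is an arbitrary example, not an Ollama requirement.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: check free disk space before downloading a model.
# free_gb and enough_space are hypothetical helper names.

free_gb() {
  # df -Pk prints sizes in 1 KB blocks in a portable one-line format;
  # column 4 of the second line is the available space.
  df -Pk "$1" | awk 'NR==2 { printf "%d", $4 / 1024 / 1024 }'
}

enough_space() {
  # $1 = directory to check, $2 = required space in whole GB
  [ "$(free_gb "$1")" -ge "$2" ]
}

# The 10 GB threshold is an assumption chosen for illustration.
if enough_space "${HOME:-/}" 10; then
  echo "enough free space for a model download"
else
  echo "less than 10 GB free; consider a smaller model"
fi
```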
&lt;h1 id=&#34;step-3-running-the-model-locally&#34;&gt;Step 3: Running the Model Locally&lt;/h1&gt;
&lt;p&gt;With Ollama and your chosen model installed, you&amp;rsquo;re ready to start interacting with the LLM.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Initiate the Model&lt;/strong&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama run llama3.1
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This command loads the Llama 3.1 model and opens an interactive session. You&amp;rsquo;ll be prompted to enter your queries directly into the Terminal.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Interact with the Model&lt;/strong&gt;:
Simply type your question or prompt and press Enter. For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama run llama3.1
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;gt;&amp;gt;&amp;gt; What are the benefits of using renewable energy?
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The model will process your input and stream its response right in your Terminal. Type &lt;code&gt;/bye&lt;/code&gt; to end the session.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&#34;step-4-customizing-model-parameters&#34;&gt;Step 4: Customizing Model Parameters&lt;/h1&gt;
&lt;p&gt;To tailor the model&amp;rsquo;s responses to better fit your requirements, you can adjust various parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Temperature&lt;/strong&gt; (&lt;code&gt;temperature&lt;/code&gt;): Controls the randomness of the output. Lower values make the output more deterministic, while higher values increase creativity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Max Tokens&lt;/strong&gt; (&lt;code&gt;num_predict&lt;/code&gt;): Sets the maximum number of tokens in the response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Top-p (Nucleus Sampling)&lt;/strong&gt; (&lt;code&gt;top_p&lt;/code&gt;): Restricts sampling to the smallest set of tokens whose cumulative probability reaches this value; lower values give more focused output.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Example Session with Custom Parameters&lt;/strong&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;ollama run llama3.1
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;gt;&amp;gt;&amp;gt; /set parameter temperature 0.7
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;ollama run&lt;/code&gt; command does not accept sampling flags such as &lt;code&gt;--temperature&lt;/code&gt;; parameters are set inside the interactive session with &lt;code&gt;/set parameter&lt;/code&gt;. Here the temperature is set to a balanced value for creativity. Repeat the command with &lt;code&gt;num_predict 150&lt;/code&gt; to limit the response length and with &lt;code&gt;top_p 0.9&lt;/code&gt; to adjust nucleus sampling.&lt;/p&gt;
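&lt;p&gt;Sampling parameters can also be stored in a Modelfile so that they apply every time a model starts. The sketch below is one way this might look: the helper &lt;code&gt;write_modelfile&lt;/code&gt; and the model name &lt;code&gt;my-assistant&lt;/code&gt; are hypothetical, and the base model &lt;code&gt;llama3.1&lt;/code&gt; is an assumption to be replaced with whichever model you downloaded.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: store sampling parameters in a custom model via a Modelfile.
# write_modelfile and the model name "my-assistant" are hypothetical.

write_modelfile() {
  # $1 = base model, $2 = temperature, $3 = top_p, $4 = num_predict
  printf 'FROM %s\n' "$1"
  printf 'PARAMETER temperature %s\n' "$2"
  printf 'PARAMETER top_p %s\n' "$3"
  printf 'PARAMETER num_predict %s\n' "$4"
}

# Write the Modelfile into the current directory and show it.
write_modelfile llama3.1 0.7 0.9 150 > Modelfile
cat Modelfile

# Build the custom model only when the ollama CLI is available.
if command -v ollama >/dev/null; then
  ollama create my-assistant -f Modelfile
fi
```

&lt;p&gt;After the model is created, &lt;code&gt;ollama run my-assistant&lt;/code&gt; would start a session with those defaults applied.&lt;/p&gt;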
&lt;h1 id=&#34;step-5-automating-and-integrating-llm-usage&#34;&gt;Step 5: Automating and Integrating LLM Usage&lt;/h1&gt;
&lt;p&gt;Once you&amp;rsquo;re comfortable running the model manually, you might want to integrate it into your workflows or automate certain tasks.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create Shell Scripts&lt;/strong&gt;:
You can write shell scripts to automate frequent queries or tasks. For example, create a script named &lt;code&gt;ask_ai.sh&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;cp&#34;&gt;#!/bin/bash
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;cp&#34;&gt;&lt;/span&gt;ollama run llama3.1 &lt;span class=&#34;s2&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span class=&#34;nv&#34;&gt;$@&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Make it executable:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;chmod +x ask_ai.sh
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now, you can run &lt;code&gt;./ask_ai.sh &amp;quot;Your question here&amp;quot;&lt;/code&gt; to get responses quickly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Integrate with IDEs or Text Editors&lt;/strong&gt;:
Enhance your development environment by integrating AI assistance directly into your coding workflow. Ollama exposes a local HTTP API (by default at &lt;code&gt;http://localhost:11434&lt;/code&gt;), so editors such as Visual Studio Code can be configured, through extensions or scripts, to send prompts to it and show suggestions and code completions.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
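&lt;p&gt;For integrations, a script can talk to Ollama&amp;rsquo;s local HTTP API instead of the interactive prompt. The following is a minimal sketch, assuming the server is running on its default port 11434; the helper names &lt;code&gt;json_escape&lt;/code&gt;, &lt;code&gt;build_payload&lt;/code&gt;, and &lt;code&gt;ask_local&lt;/code&gt; are hypothetical, and the model name is an assumption.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: build a request for the local Ollama HTTP API.
# Assumes the server listens on its default port, 11434.
# json_escape, build_payload, and ask_local are hypothetical helper names.

json_escape() {
  # Escape backslashes and double quotes so the text can sit inside a JSON string.
  # Newlines and control characters are not handled in this sketch.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

build_payload() {
  # $1 = model name, $2 = prompt, $3 = temperature
  printf '{"model":"%s","prompt":"%s","stream":false,"options":{"temperature":%s}}' \
    "$1" "$(json_escape "$2")" "$3"
}

ask_local() {
  # Sends the payload to the generate endpoint; requires a running server.
  curl -s http://localhost:11434/api/generate -d "$(build_payload "$1" "$2" "$3")"
}

# Print the request body without contacting the server.
build_payload llama3.1 'Say "hi"' 0.7
```

&lt;p&gt;With the server running, &lt;code&gt;ask_local llama3.1 &amp;quot;Your question here&amp;quot; 0.7&lt;/code&gt; would return a JSON document whose &lt;code&gt;response&lt;/code&gt; field holds the generated text.&lt;/p&gt;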
&lt;h1 id=&#34;troubleshooting-common-issues&#34;&gt;Troubleshooting Common Issues&lt;/h1&gt;
&lt;p&gt;Setting up complex tools can sometimes lead to unexpected challenges. Here are a few common issues and their solutions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ollama Command Not Found&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ensure that Homebrew&amp;rsquo;s bin directory is in your PATH. On Apple silicon Macs, Homebrew installs to &lt;code&gt;/opt/homebrew&lt;/code&gt; rather than &lt;code&gt;/usr/local&lt;/code&gt;. You can add it by adding the following line to your &lt;code&gt;~/.zshrc&lt;/code&gt; or &lt;code&gt;~/.bash_profile&lt;/code&gt;:
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nb&#34;&gt;export&lt;/span&gt; &lt;span class=&#34;nv&#34;&gt;PATH&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;/opt/homebrew/bin:&lt;/span&gt;&lt;span class=&#34;nv&#34;&gt;$PATH&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;Then, reload the profile:
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nb&#34;&gt;source&lt;/span&gt; ~/.zshrc
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model Installation Fails&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Check your internet connection and ensure sufficient disk space.&lt;/li&gt;
&lt;li&gt;Update Homebrew and Ollama:
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;brew update
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;brew upgrade ollama
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Performance Issues&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Models are loaded into the Mac&amp;rsquo;s unified memory, so close other memory-intensive applications while a model is running, and try a smaller model if responses remain slow.&lt;/li&gt;
&lt;li&gt;Monitor system resources using Activity Monitor to identify and mitigate bottlenecks.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
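&lt;p&gt;For the &amp;ldquo;command not found&amp;rdquo; case, the right directory depends on the chip. The sketch below picks Homebrew&amp;rsquo;s default bin directory from the machine architecture and adds it to PATH for the current shell only; &lt;code&gt;brew_bin_for_arch&lt;/code&gt; is a hypothetical helper name, and custom Homebrew prefixes are not handled.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: pick the Homebrew bin directory that matches the chip.
# brew_bin_for_arch is a hypothetical helper name.

brew_bin_for_arch() {
  # Homebrew's default prefix is /opt/homebrew on Apple silicon (arm64)
  # and /usr/local on Intel Macs; anything else is treated as Intel here.
  case "$1" in
    arm64) echo "/opt/homebrew/bin" ;;
    *)     echo "/usr/local/bin" ;;
  esac
}

brew_bin="$(brew_bin_for_arch "$(uname -m)")"

# Prepend the directory only if PATH does not already contain it.
case ":$PATH:" in
  *":$brew_bin:"*) echo "PATH already contains $brew_bin" ;;
  *) export PATH="$brew_bin:$PATH"; echo "added $brew_bin to PATH" ;;
esac
```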
&lt;h1 id=&#34;unlocking-the-full-potential-of-your-macbook-air-m2&#34;&gt;Unlocking the Full Potential of Your MacBook Air M2&lt;/h1&gt;
&lt;p&gt;Setting up an LLM locally on your MacBook Air with Ollama gives you more than
a chat prompt in the Terminal. Whether it&amp;rsquo;s automating content creation,
supporting your coding projects, or prototyping AI-driven applications, the
combination of Ollama and Apple silicon provides a solid foundation that
works without sending your data to a cloud service.&lt;/p&gt;
&lt;h1 id=&#34;benefits-recap&#34;&gt;Benefits Recap:&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Speed&lt;/strong&gt;: Local processing removes network round trips; response time depends on the model size and your Mac&amp;rsquo;s available memory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privacy&lt;/strong&gt;: Your data remains on your device, which helps keep it confidential.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Customization&lt;/strong&gt;: Adjust prompts and sampling parameters, or switch models, to match your project requirements.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost-Efficiency&lt;/strong&gt;: Reduce reliance on subscription-based AI services.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&#34;final-thoughts&#34;&gt;Final Thoughts&lt;/h1&gt;
&lt;p&gt;The integration of powerful hardware like the MacBook Air M2 with efficient tools like Ollama democratizes access to advanced AI capabilities. Whether you&amp;rsquo;re a seasoned developer or a curious hobbyist, setting up an LLM locally empowers you to experiment, innovate, and create without the constraints of cloud dependencies.&lt;/p&gt;
&lt;p&gt;Ready to dive into the world of local AI? Follow the steps outlined above, and start harnessing the full potential of large language models right from your MacBook Air M2. The future of AI is at your fingertips!&lt;/p&gt;
&lt;h1 id=&#34;happy-coding&#34;&gt;Happy coding!&lt;/h1&gt;
</description>
    </item>
    
  </channel>
</rss>
