Hello everyone! In this blog post, we'll explore how to get started with the Gemini API using Python. Our main objective is to interact with various Gemini models using the API keys we've created. We'll be covering everything from setting up your development environment to generating text and streaming responses. By the end of this guide, you'll have a solid foundation to build more complex projects using the Gemini API.
Setting Up Your Development Environment
Prerequisites
Before we dive in, make sure you have the following:
Python Version: Ensure you have Python 3.9 or higher installed.
API Key: Make sure you've created your API key. If not, check out my other blog on how to create an API key.
Setting Up in Google Colab
Google Colab is a great environment for quick experimentation and collaboration. Here’s how to set it up:
Create a New Notebook: Go to Google Colab and create a new notebook.
Install the Required Packages: Run the following command to install the Gemini API SDK:
!pip install -q -U google-generativeai
Import Necessary Libraries:
# Import the pathlib module for filesystem path manipulations
import pathlib
# Import the textwrap module for text formatting
import textwrap
# Import the google.generativeai module for interacting with Google's generative AI services
import google.generativeai as genai
# Import display and Markdown from IPython.display to display formatted output in Jupyter notebooks
from IPython.display import display, Markdown

# Define a function to convert text to Markdown format with custom modifications
def to_markdown(text):
    # Replace each bullet character in the text with a Markdown list marker
    text = text.replace('•', '  *')
    # Indent each line of the text with '> ' to format it as a blockquote in Markdown
    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

# Example usage (commented out):
# markdown_text = to_markdown("This is a test sentence. This is another sentence.")
# display(markdown_text)
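To see what the indent step does on its own, here is a small standalone check of textwrap.indent (no Gemini call involved):

```python
import textwrap

text = "First line.\nSecond line."
# Prefix every line (including empty ones, thanks to the predicate) with '> '
quoted = textwrap.indent(text, '> ', predicate=lambda _: True)
print(quoted)
# > First line.
# > Second line.
```

The predicate forces the prefix onto every line; without it, textwrap.indent skips lines that consist only of whitespace.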
Set Up API Key: Store your API key securely in Colab.
# Import the userdata module from Google Colab
from google.colab import userdata
# Retrieve the Google API key stored in Colab's userdata
google_api_key = userdata.get('GOOGLE_API_KEY')
# Configure the SDK with the retrieved key
genai.configure(api_key=google_api_key)
List the models provided by the generative AI service:
# Iterate through the list of models provided by the generative AI service
for m in genai.list_models():
    # Check if 'generateContent' is in the list of supported generation methods for the model
    if 'generateContent' in m.supported_generation_methods:
        # Print the name of the model
        print(m.name)
Initialize the generative model: The GenerativeModel class is instantiated with the model name 'gemini-pro'.
# Initialize the generative model with the specified model name 'gemini-pro'
model = genai.GenerativeModel('gemini-pro')
Display the model information: The model object is displayed to show its details.
# Display the model information
model
Measure the execution time: The %%time magic command measures how long the cell takes to run; note that it must be the very first line of the cell. Generate content: The generate_content method is called with the prompt "What is the meaning of Man?" to generate a response.
%%time
# Generate content based on the prompt "What is the meaning of Man?"
response = model.generate_content("What is the meaning of Man?")
Convert to Markdown: The generated text is converted to Markdown format and displayed.
# Convert the generated text to Markdown format and display it
to_markdown(response.text)
Display prompt feedback: The feedback related to the prompt is displayed.
# Display feedback related to the prompt
response.prompt_feedback
Display candidate responses: All candidate responses generated by the model are displayed.
# Display all candidate responses generated by the model
response.candidates
Measure execution time with streaming: The %%time magic command is again used, as the first line of the cell, to measure the streaming generation. Generate content with streaming: The generate_content method is called with stream=True, allowing partial responses to be processed as they are generated.
%%time
# Generate content for the prompt "What is the meaning of Man?" with streaming enabled
response = model.generate_content("What is the meaning of Man?", stream=True)
Iterate through streamed responses: The streamed response chunks are iterated over, and the text of each chunk is printed along with a separator line for better readability.
# Iterate through the streamed response chunks and print the text of each chunk
for chunk in response:
    print(chunk.text)
    print("_" * 80)  # Print a separator line for better readability
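Each streamed chunk carries only part of the answer; if you also want the complete text once the stream finishes, you can accumulate the chunks as they arrive. A minimal sketch using stand-in chunk objects (no API call is made; the Chunk class below is only for illustration):

```python
class Chunk:
    """Stand-in for a streamed response chunk, which exposes a .text attribute."""
    def __init__(self, text):
        self.text = text

# Simulated stream of partial responses
stream = [Chunk("Hello, "), Chunk("streamed "), Chunk("world.")]

parts = []
for chunk in stream:
    parts.append(chunk.text)  # collect each partial piece
    print(chunk.text)         # process it as soon as it arrives

full_text = "".join(parts)
print(full_text)  # Hello, streamed world.
```

The same accumulate-then-join pattern works with the real streamed response, since its chunks also expose a .text attribute.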
Summary of Key Steps
Create a New Notebook: Go to Google Colab and create a new notebook.
Install the Required Packages: Run the command to install the Gemini API SDK.
Import Necessary Libraries: Import the required libraries for filesystem manipulations, text formatting, interacting with Google's generative AI services, and displaying formatted output.
Define a Function for Markdown Conversion: Create a function to convert text to Markdown format with custom modifications.
Set Up API Key: Store and retrieve your API key securely in Colab.
List Models Provided by the Generative AI: Iterate through and print the list of models provided by the generative AI service.
Initialize the Generative Model: Instantiate the GenerativeModel class with the specified model name.
Display the Model Information: Display the model object to show its details.
Measure the Execution Time: Use the %%time magic command to measure the time taken for the subsequent code block to execute.
Generate Content: Call the generate_content method with the prompt to generate a response.
Convert to Markdown and Display: Convert the generated text to Markdown format and display it.
Display Prompt Feedback: Display the feedback related to the prompt.
Display Candidate Responses: Display all candidate responses generated by the model.
Generate Content with Streaming: Use the %%time magic command to measure the time taken for streaming generation and call the generate_content method with streaming enabled.
Iterate Through Streamed Responses: Iterate through the streamed response chunks and print the text of each chunk with a separator line for better readability.
Setting Up in Local Environment Using Visual Studio Code
If you prefer to work locally, here’s how to set it up using Visual Studio Code:
1. Create a Project Directory
Create a directory for your project and navigate into it:
mkdir my_project
cd my_project
2. Create a Virtual Environment
Create a virtual environment to manage your project dependencies:
conda create -p ./venv python=3.10
conda activate ./venv
3. Create a requirements.txt File
Create a requirements.txt file in your project directory to list the necessary dependencies (pathlib and textwrap ship with Python, so they don't need to be installed):
google-generativeai
IPython
python-dotenv
4. Install Required Packages
Install the packages listed in requirements.txt:
pip install -r requirements.txt
5. Create a .env File
Create a .env file in your project directory and add your GOOGLE_API_KEY:
GOOGLE_API_KEY=your_api_key_here
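load_dotenv() later reads each KEY=value line from this file into the process environment. The handling of a simple line can be sketched like this (a toy illustration only, not the real python-dotenv logic, which also handles quoting, comments, and exports):

```python
def parse_env_line(line):
    # Split a simple KEY=value line at the first '='
    key, _, value = line.strip().partition('=')
    return key, value

key, value = parse_env_line("GOOGLE_API_KEY=your_api_key_here")
print(key, value)  # GOOGLE_API_KEY your_api_key_here
```

After load_dotenv() has run, the value is available via os.getenv('GOOGLE_API_KEY'), which is exactly how the script below retrieves it.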
6. Create the Python Script
Create a new Python script file (e.g., main.py) in your project directory, open it in Visual Studio Code, and copy the following code into it:
# Import the necessary libraries
import pathlib
import textwrap
import google.generativeai as genai
from IPython.display import display, Markdown
from dotenv import load_dotenv
import os
import time

# Load environment variables from .env file
load_dotenv()

# Define a function to convert text to Markdown format with custom modifications
def to_markdown(text):
    # Replace each bullet character in the text with a Markdown list marker
    text = text.replace('•', '  *')
    # Indent each line of the text with '> ' to format it as a blockquote in Markdown
    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

# Set up your API key
google_api_key = os.getenv('GOOGLE_API_KEY')
if not google_api_key:
    raise ValueError("GOOGLE_API_KEY not found in environment variables.")
genai.configure(api_key=google_api_key)

# List and print models provided by the generative AI service
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)

# Initialize the generative model with the specified model name 'gemini-pro'
model = genai.GenerativeModel('gemini-pro')

# Display the model information
print(model)

# Measure the execution time
start_time = time.time()
# Generate content based on the prompt "What is the meaning of Man?"
response = model.generate_content("What is the meaning of Man?")
# Print the time taken to generate the content
print("Time taken: %s seconds" % (time.time() - start_time))

# Outside a notebook, display() has no effect, so print the Markdown object's
# underlying text instead
print(to_markdown(response.text).data)

# Display feedback related to the prompt
print(response.prompt_feedback)

# Display all candidate responses generated by the model
print(response.candidates)

# Measure execution time with streaming
start_time = time.time()
# Generate content for the prompt "What is the meaning of Man?" with streaming enabled
response = model.generate_content("What is the meaning of Man?", stream=True)
# Print the time taken to start the streamed generation
print("Time taken with streaming: %s seconds" % (time.time() - start_time))

# Iterate through the streamed response chunks and print the text of each chunk
for chunk in response:
    print(chunk.text)
    print("_" * 80)  # Print a separator line for better readability
7. Run the Script
Open a terminal in Visual Studio Code and run your script:
python main.py
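A side note on the timing in main.py: time.time() reads the wall clock, which can jump if the system clock is adjusted, so time.perf_counter() is the more reliable choice for measuring intervals. A standalone sketch (the sum is just a stand-in workload for the API call):

```python
import time

start = time.perf_counter()
total = sum(range(1_000_000))  # stand-in workload for the API call
elapsed = time.perf_counter() - start
print("Time taken: %.3f seconds" % elapsed)
```

Swapping time.time() for time.perf_counter() in main.py requires no other changes, since both return a float number of seconds.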
Summary of Key Steps
Create a Project Directory: Create and navigate into your project directory.
Create a Virtual Environment: Use conda create -p ./venv python=3.10 and activate it with conda activate ./venv.
Create a requirements.txt File: List the necessary dependencies in requirements.txt.
Install Required Packages: Install packages using pip install -r requirements.txt.
Create and Write the Code: Create a Python script (main.py) and add the provided code.
Run the Script: Execute the script in the terminal using python main.py.
By following these steps, you can replicate the setup you had in Google Colab in your local environment using Visual Studio Code. This will allow you to experiment with Google Gemini models locally.
Conclusion
In this blog post, we covered the basics of setting up your development environment and interacting with the Gemini API using Python, both in Google Colab and locally with Visual Studio Code. We explored listing the available models, generating text, inspecting prompt feedback and candidate responses, and streaming responses.
Understanding these fundamentals will prepare you for more complex projects that leverage the power of the Gemini models. Stay tuned for more advanced tutorials and end-to-end project examples.
Thank you for reading, and happy coding!