Revanth Reddy Tondapu
- Sep 19
- 4 min read

A Step-by-Step Guide to Integrate HubSpot with Neo4j Using Python

Updated: 6 days ago

Integrating HubSpot data with a Neo4j database lets you query CRM records as a graph, which can help with business analytics and relationship management. This blog post provides a step-by-step guide for setting up this integration using Python, organized around a few small classes. We will cover everything from setting up API access to syncing contacts and companies into Neo4j.

Step 1: Create a Virtual Environment

Let's start by setting up a virtual environment. This will help manage our dependencies and keep our project organized.

conda create -p venv python=3.10
conda activate venv/

Step 2: Install Required Packages

Next, create a requirements.txt file with the following content:

hubspot-api-client
neo4j
python-dotenv
requests

Then install the required packages using pip:

pip install -r requirements.txt

Step 3: Set Up Environment Variables

Create a .env file to securely store your API key and Neo4j credentials:

HUBSPOT_API_KEY=your_api_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password

Make sure to load these environment variables in your Python scripts:

# file: config.py
from dotenv import load_dotenv
import os

load_dotenv()
hubspot_api_key = os.getenv("HUBSPOT_API_KEY")
neo4j_uri = os.getenv("NEO4J_URI")
neo4j_user = os.getenv("NEO4J_USERNAME")
neo4j_password = os.getenv("NEO4J_PASSWORD")
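
When a variable is missing, os.getenv returns None, and the problem tends to surface only later as an authentication or connection error. A minimal sketch of a fail-fast check is shown here; the helper name require_env is hypothetical and not part of the config.py above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message.

    `require_env` is a hypothetical helper used for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


if __name__ == "__main__":
    # Check every setting this guide relies on before doing any work.
    for key in ("HUBSPOT_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"):
        try:
            require_env(key)
            print(f"{key}: set")
        except RuntimeError as exc:
            print(exc)
```

Calling require_env in place of os.getenv in config.py would stop the script at import time with the name of the missing setting.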

Step 4: Setting Up HubSpot API Access

Register Application

Visit the HubSpot Developer portal.
Register a new application to obtain your client ID and client secret.

Obtain API Keys or OAuth Tokens

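API Keys: HubSpot has retired its legacy API keys, so create a private app in your HubSpot account settings and use its access token instead. The Bearer header used later in this guide expects a token of this kind.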
OAuth Tokens: Follow the OAuth 2.0 flow to obtain access tokens.

Step 5: Design Data Model for HubSpot Data

Identify Key Entities

Determine the key HubSpot entities you need to sync, such as Contacts and Companies.

Map Fields

For each entity, map the fields from HubSpot to your application. For instance, a Contact might have fields like id, first_name, last_name, and email.
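
One way to make this mapping explicit is a small dataclass plus a conversion function. This is only a sketch: the Contact class and contact_from_hubspot are hypothetical names, and the HubSpot property names (firstname, lastname, email) are the ones used by the sync code later in this guide.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Contact:
    """Application-side view of a HubSpot contact (hypothetical class)."""
    id: str
    first_name: Optional[str]
    last_name: Optional[str]
    email: Optional[str]


def contact_from_hubspot(record: Dict[str, Any]) -> Contact:
    """Map one record from the contacts response to a Contact."""
    props = record.get("properties") or {}
    return Contact(
        id=str(record["id"]),
        first_name=props.get("firstname"),
        last_name=props.get("lastname"),
        email=props.get("email"),
    )


# Made-up sample record in the shape the sync code in this guide reads.
sample = {
    "id": "101",
    "properties": {"firstname": "Ada", "lastname": "Lovelace", "email": "ada@example.com"},
}
print(contact_from_hubspot(sample))
```

Keeping the mapping in one function means a renamed or added field only has to be changed in a single place.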

Design Schema

Create a schema for these entities in your Neo4j database. For instance, you might have nodes for Contacts and Companies and relationships between them.
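
Neo4j does not require a schema up front, but uniqueness constraints on the id property are a common way to pin down the model described above. The sketch below assumes Neo4j 4.4 or newer (for the FOR ... REQUIRE syntax); the constraint names and the apply_constraints helper are hypothetical.

```python
# Cypher statements that make Contact.id and Company.id unique.
# Assumption: Neo4j 4.4+; the constraint names are illustrative.
CONSTRAINTS = [
    "CREATE CONSTRAINT contact_id IF NOT EXISTS FOR (c:Contact) REQUIRE c.id IS UNIQUE",
    "CREATE CONSTRAINT company_id IF NOT EXISTS FOR (c:Company) REQUIRE c.id IS UNIQUE",
]


def apply_constraints(session) -> int:
    """Run each constraint statement on a session-like object with .run().

    Returns the number of statements issued.
    """
    for statement in CONSTRAINTS:
        session.run(statement)
    return len(CONSTRAINTS)
```

With a real driver this could be called once at startup, for example inside `with driver.session() as session: apply_constraints(session)`. Because of IF NOT EXISTS, repeating the call is harmless.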

Step 6: Implement HubSpot Data Retrieval

Develop API Client

Create functions to retrieve data from HubSpot using the API key.

# file: hubspot_client.py
import requests
from config import hubspot_api_key

class HubSpotClient:
    """Thin wrapper around the HubSpot CRM v3 objects endpoints."""

    BASE_URL = "https://api.hubapi.com/crm/v3/objects"

    def __init__(self):
        # Sent as a Bearer token in the Authorization header
        self.api_key = hubspot_api_key

    def _get_objects(self, object_type):
        # Returns only the first page of results; paging is not followed here
        headers = {
            "Authorization": f"Bearer {self.api_key}"
        }
        response = requests.get(f"{self.BASE_URL}/{object_type}", headers=headers, timeout=30)
        response.raise_for_status()
        return response.json().get('results', [])

    def get_contacts(self):
        return self._get_objects("contacts")

    def get_companies(self):
        return self._get_objects("companies")
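
The client above reads a single page of results. The CRM v3 list endpoints return a paging.next.after cursor when more records exist, so a full export has to follow that cursor. The following is a sketch under that assumption; iter_records and make_fetcher are hypothetical helpers, and the page size of 100 is assumed to be within the endpoint's limit.

```python
from typing import Any, Callable, Dict, Iterator, Optional

API_BASE = "https://api.hubapi.com/crm/v3/objects"


def iter_records(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield every record, following the paging.next.after cursor until it is absent."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("results", [])
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break


def make_fetcher(object_type: str, token: str, limit: int = 100):
    """Build a fetch_page callable backed by requests (hypothetical helper)."""
    import requests  # imported lazily so iter_records works without it installed

    def fetch_page(after: Optional[str]) -> Dict[str, Any]:
        params: Dict[str, Any] = {"limit": limit}
        if after:
            params["after"] = after
        response = requests.get(
            f"{API_BASE}/{object_type}",
            headers={"Authorization": f"Bearer {token}"},
            params=params,
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    return fetch_page
```

A call such as `list(iter_records(make_fetcher("contacts", hubspot_api_key)))` would then collect all contacts rather than the first page. Separating the cursor loop from the HTTP call also makes the loop easy to test with a fake fetch_page.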

Step 7: Sync HubSpot Data to Neo4j Using OOP

Define the Sync Class

Refactor the functions into a class to encapsulate the functionality.

# file: neo4j_sync.py
from neo4j import GraphDatabase
from config import neo4j_uri, neo4j_user, neo4j_password

class Neo4jSync:

    def __init__(self):
        self.driver = GraphDatabase.driver(neo4j_uri, auth=(neo4j_user, neo4j_password))

    def close(self):
        self.driver.close()

    def create_contact(self, tx, id, email, firstname, lastname, created_at, updated_at):
        tx.run("CREATE (c:Contact {id: $id, email: $email, firstname: $firstname, lastname: $lastname, createdAt: $createdAt, updatedAt: $updatedAt})",
               id=id, email=email, firstname=firstname, lastname=lastname, createdAt=created_at, updatedAt=updated_at)

    def create_company(self, tx, id, name, created_at, updated_at):
        tx.run("CREATE (c:Company {id: $id, name: $name, createdAt: $createdAt, updatedAt: $updatedAt})",
               id=id, name=name, createdAt=created_at, updatedAt=updated_at)

    def sync_contacts(self, contacts):
        # Note: create_contact uses CREATE, so re-running the sync adds duplicate nodes
        with self.driver.session() as session:
            for contact in contacts:
                properties = contact.get('properties', {})
                # execute_write replaces write_transaction, which is deprecated in the 5.x driver
                session.execute_write(self.create_contact,
                                      id=contact.get('id'),
                                      email=properties.get('email'),
                                      firstname=properties.get('firstname'),
                                      lastname=properties.get('lastname'),
                                      created_at=properties.get('createdate'),
                                      updated_at=properties.get('lastmodifieddate'))

    def sync_companies(self, companies):
        with self.driver.session() as session:
            for company in companies:
                properties = company.get('properties', {})
                session.execute_write(self.create_company,
                                      id=company.get('id'),
                                      name=properties.get('name'),
                                      created_at=properties.get('createdate'),
                                      updated_at=properties.get('lastmodifieddate'))
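
Because the class above writes nodes with CREATE, running the sync twice produces duplicate Contact and Company nodes. A possible alternative is MERGE on the HubSpot id, which updates an existing node instead. The sketch below is not the original implementation; upsert_contact and link_contact_to_company are hypothetical functions, and the WORKS_AT relationship type is an assumed name for the Contact-to-Company relationship mentioned in Step 5.

```python
def upsert_contact(tx, contact):
    """Create or update a Contact node keyed on its HubSpot id."""
    props = contact.get("properties") or {}
    tx.run(
        "MERGE (c:Contact {id: $id}) "
        "SET c.email = $email, c.firstname = $firstname, c.lastname = $lastname, "
        "c.createdAt = $createdAt, c.updatedAt = $updatedAt",
        id=contact.get("id"),
        email=props.get("email"),
        firstname=props.get("firstname"),
        lastname=props.get("lastname"),
        createdAt=props.get("createdate"),
        updatedAt=props.get("lastmodifieddate"),
    )


def link_contact_to_company(tx, contact_id, company_id):
    """Connect an existing Contact to an existing Company.

    WORKS_AT is an assumed relationship type; how the id pairs are
    obtained from HubSpot is outside this sketch.
    """
    tx.run(
        "MATCH (c:Contact {id: $contact_id}), (co:Company {id: $company_id}) "
        "MERGE (c)-[:WORKS_AT]->(co)",
        contact_id=contact_id,
        company_id=company_id,
    )
```

Both functions take the transaction as their first argument, so they can be passed to the session in the same way as create_contact, for example `session.execute_write(upsert_contact, contact)`.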

Main Function

Create a main function to instantiate and use the HubSpotClient and Neo4jSync classes.

# file: main.py
from hubspot_client import HubSpotClient
from neo4j_sync import Neo4jSync

def main():
    hubspot_client = HubSpotClient()
    neo4j_sync = Neo4jSync()

    try:
        # Fetch contacts from HubSpot and sync to Neo4j
        contacts = hubspot_client.get_contacts()
        neo4j_sync.sync_contacts(contacts)

        # Fetch companies from HubSpot and sync to Neo4j
        companies = hubspot_client.get_companies()
        neo4j_sync.sync_companies(companies)
    finally:
        # Always release the driver's connections, even if a step fails
        neo4j_sync.close()

    print("Data sync completed successfully.")

if __name__ == "__main__":
    main()
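
The main function above performs a full sync on every run. An incremental variant could ask HubSpot only for records modified since the previous run, using the CRM search endpoint (POST /crm/v3/objects/contacts/search). The sketch below only builds the request body; build_modified_since_payload is a hypothetical helper, the epoch-millisecond value format is an assumption, and storing the time of the last successful run is left out.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_modified_since_payload(
    since: datetime,
    property_name: str = "lastmodifieddate",
    after: Optional[str] = None,
    limit: int = 100,
) -> Dict[str, Any]:
    """Build a search body matching records modified after `since`.

    Assumptions: the filter value is sent as epoch milliseconds, and
    `property_name` exists on the object type (companies may use a
    different property, such as hs_lastmodifieddate).
    """
    since_ms = int(since.timestamp() * 1000)
    payload: Dict[str, Any] = {
        "filterGroups": [
            {
                "filters": [
                    {
                        "propertyName": property_name,
                        "operator": "GT",
                        "value": str(since_ms),
                    }
                ]
            }
        ],
        "limit": limit,
    }
    if after:
        # Cursor from the previous page of search results
        payload["after"] = after
    return payload


last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(build_modified_since_payload(last_run))
```

The resulting dictionary would be sent as the JSON body of the search request with the same Bearer header as the other calls, and the returned records could then be passed to sync_contacts as before.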

Step 8: Run the Code

After setting up your environment and writing the necessary code, it's time to run the script to fetch data from HubSpot and sync it to your Neo4j database. Here’s a quick recap of the files and their contents:

config.py: Handles loading environment variables.
hubspot_client.py: Contains the HubSpotClient class for interacting with the HubSpot API.
neo4j_sync.py: Contains the Neo4jSync class for syncing data to Neo4j.
main.py: The entry point to run the entire process.

Running the Script

Ensure all your environment variables are correctly set in the .env file. Then, simply run the main.py script to start the data synchronization process.

1. Open your terminal.
2. Activate your virtual environment (if not already activated):

conda activate venv/

3. Run the main.py script:

python main.py

Expected Output

If everything is set up correctly, you should see:

Data sync completed successfully.

This message indicates that the data has been successfully fetched from HubSpot and inserted into your Neo4j database.

Conclusion

Integrating HubSpot with Neo4j using Python can provide powerful insights and enhanced data management capabilities for your business. By following the steps outlined in this guide, you can set up a robust and efficient sync mechanism that keeps your CRM data up-to-date and easily accessible for analysis and decision-making.

From creating a virtual environment and managing dependencies to securely storing API keys, we have covered the essential steps to get you started. The instructions on setting up API access, designing a data model, and syncing contacts and companies into Neo4j give you a working foundation for the integration.

By leveraging the power of Neo4j's graph database and HubSpot's comprehensive CRM capabilities, you can unlock new opportunities for data visualization and relationship management. This integration not only helps in maintaining data consistency but also enables advanced queries and analytics that can drive business growth.

Whether you are a developer looking to streamline data operations or a business analyst aiming to derive more value from CRM data, this guide equips you with the necessary tools and knowledge to achieve your goals. Happy coding!