Google Spanner
Spanner is a highly scalable database that combines unlimited scalability with relational semantics, such as secondary indexes, strong consistency, schemas, and SQL providing 99.999% availability in one easy solution.
This notebook goes over how to use Spanner to save, load and delete langchain documents with SpannerLoader
and SpannerDocumentSaver
.
Learn more about the package on GitHub.
Before You Begin
To run this notebook, you will need to do the following:
- Create a Google Cloud Project
- Enable the Cloud Spanner API
- Create a Spanner instance
- Create a Spanner database
- Create a Spanner table
After confirmed access to database in the runtime environment of this notebook, filling the following values and run the cell before running example scripts.
# @markdown Please specify an instance id, a database, and a table for demo purpose.
INSTANCE_ID = "test_instance" # @param {type:"string"}
DATABASE_ID = "test_database" # @param {type:"string"}
TABLE_NAME = "test_table" # @param {type:"string"}
🦜🔗 Library Installation
The integration lives in its own langchain-google-spanner
package, so we need to install it.
%pip install -upgrade --quiet langchain-google-spanner langchain
Colab only: Uncomment the following cell to restart the kernel or use the button to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top.
# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython
# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)
☁ Set Your Google Cloud Project
Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook.
If you don't know your project ID, try the following:
- Run
gcloud config list
. - Run
gcloud projects list
. - See the support page: Locate the project ID.
# @markdown Please fill in the value below with your Google Cloud project ID and then run the cell.
PROJECT_ID = "my-project-id" # @param {type:"string"}
# Set the project id
!gcloud config set project {PROJECT_ID}
🔐 Authentication
Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.
- If you are using Colab to run this notebook, use the cell below and continue.
- If you are using Vertex AI Workbench, check out the setup instructions here.
from google.colab import auth
auth.authenticate_user()
Basic Usage
Save documents
Save langchain documents with SpannerDocumentSaver.add_documents(<documents>)
. To initialize SpannerDocumentSaver
class you need to provide 3 things:
instance_id
- An instance of Spanner to load data from.database_id
- An instance of Spanner database to load data from.table_name
- The name of the table within the Spanner database to store langchain documents.
from langchain_core.documents import Document
from langchain_google_spanner import SpannerDocumentSaver
test_docs = [
Document(
page_content="Apple Granny Smith 150 0.99 1",
metadata={"fruit_id": 1},
),
Document(
page_content="Banana Cavendish 200 0.59 0",
metadata={"fruit_id": 2},
),
Document(
page_content="Orange Navel 80 1.29 1",
metadata={"fruit_id": 3},
),
]
saver = SpannerDocumentSaver(
instance_id=INSTANCE_ID,
database_id=DATABASE_ID,
table_name=TABLE_NAME,
)
saver.add_documents(test_docs)