NMDC Data Portal
The data portal is not a system of record for NMDC data. It is a copy of the system of record (MongoDB), transformed into a relational schema optimized for the kinds of queries the data portal client needs to make.
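As a rough illustration of what that transform involves (the collection, table, and field names below are hypothetical, not the actual NMDC schema), documents are read from MongoDB and flattened into SQLAlchemy models:

```python
# Illustrative sketch only: collection/table names and fields are assumptions,
# not the real NMDC schema or ingest code.
from pymongo import MongoClient
from sqlalchemy import Column, Float, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Biosample(Base):
    """Relational copy of a document from the Mongo system of record."""
    __tablename__ = "biosample"
    id = Column(String, primary_key=True)
    name = Column(String)
    depth = Column(Float)

def ingest_biosamples(mongo_uri: str, postgres_uri: str) -> None:
    mongo = MongoClient(mongo_uri)
    engine = create_engine(postgres_uri)
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        # Flatten each Mongo document into a row shaped for portal queries.
        for doc in mongo.get_database("nmdc")["biosample_set"].find():
            session.merge(Biosample(
                id=doc["id"],
                name=doc.get("name"),
                depth=(doc.get("depth") or {}).get("has_numeric_value"),
            ))
        session.commit()
```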
The NMDC Data Portal is built with these technologies:
Python and FastAPI
PostgreSQL and SQLAlchemy
Celery and Redis Queue
Vue JS and Vuetify
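A minimal sketch of how these pieces fit together (broker URL, task body, and names are assumptions, and authentication is omitted): FastAPI handles the HTTP request, Celery with a Redis broker runs the long job in the worker container, and SQLAlchemy talks to PostgreSQL.

```python
# Sketch only: names and URLs here are illustrative, not the portal's code layout.
from celery import Celery
from fastapi import FastAPI

celery_app = Celery("portal", broker="redis://redis:6379/0", backend="redis://redis:6379/0")
app = FastAPI()

@celery_app.task
def ingest_task():
    """Long-running job executed in the worker container."""
    # Read from the Mongo system of record and write to the staging Postgres
    # database via SQLAlchemy.
    ...

@app.post("/api/jobs/ingest")
def start_ingest():
    # The API only enqueues the job; a worker picks it up from Redis.
    result = ingest_task.delay()
    return {"task_id": result.id}
```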
Software versions
General architecture
[Architecture diagram: nmdc-diagram]
API Documentation
Information about how to use the search portal REST API can be found in the wiki.
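For a quick sense of what a call looks like from Python (the host and search path here are placeholders; the wiki has the real endpoints and query parameters):

```python
import requests

BASE_URL = "https://data.microbiomedata.org"  # assumed portal host
# Placeholder search request; consult the wiki for actual endpoints and payloads.
resp = requests.post(f"{BASE_URL}/api/biosample/search", json={"conditions": []})
resp.raise_for_status()
print(resp.json())
```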
Development documentation
Data Ingest
The ingestion procedure is a two-step process:
Data is ingested into a staging database
The staging and production databases are swapped in Spin
The steps to perform an ingest are as follows:
Step 1. Ingest into the staging database
Execute the POST /api/jobs/ingest endpoint through the Swagger docs. This endpoint requires being logged in with a whitelisted ORCiD; to add new users to the whitelist, see the code block in auth.py. The job takes several hours to complete. Verify that it finished successfully by checking the worker logs before moving on.
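If you want to trigger the job outside the Swagger UI, a rough equivalent with requests might look like the following; the host and the way the credential is passed are assumptions, since in practice logging in through the Swagger docs handles authentication for you:

```python
import requests

BASE_URL = "https://data.microbiomedata.org"  # or wherever this deployment lives
session = requests.Session()
# Assumes you already obtained a token by logging in with a whitelisted ORCiD;
# the header below is illustrative.
session.headers["Authorization"] = "Bearer <token>"

resp = session.post(f"{BASE_URL}/api/jobs/ingest")
resp.raise_for_status()
print(resp.status_code, resp.text)
```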
Step 2. Modify environment variables to swap prod/staging
In the Rancher 2 UI, select Resources -> Secrets from the toolbar and click on the postgres secret group. Click the button with three vertical dots in the upper right and select Edit. Swap the values of INGEST_URI and POSTGRES_URI, then click Save.
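Before redeploying, it can help to confirm that the two URIs now point at different databases. A small sanity-check sketch (the check itself is just an assumption about what's useful; the variable names match the secret keys above):

```python
import os
from sqlalchemy import create_engine, text

# INGEST_URI and POSTGRES_URI are the secret values edited above.
for name in ("INGEST_URI", "POSTGRES_URI"):
    engine = create_engine(os.environ[name])
    with engine.connect() as conn:
        db = conn.execute(text("select current_database()")).scalar()
        print(f"{name}: {db}")

# The two database names should differ; if they are identical, the swap
# was not saved correctly.
```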
Step 3. Restart the containers
Go back to the Workloads page and redeploy both the backend and worker services. If the site doesn’t work with the newest data, you can always revert the changes to the secrets, provided you haven’t started a new ingest.
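A minimal post-redeploy smoke check, assuming the portal's public URL (any lightweight request that proves the backend is serving again will do):

```python
import requests

BASE_URL = "https://data.microbiomedata.org"  # assumed deployment URL

resp = requests.get(BASE_URL + "/", timeout=30)
resp.raise_for_status()
print("portal is responding:", resp.status_code)
```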