Configuration¶
All configuration is done through environment variables.
Required Variables¶
| Variable | Description |
|---|---|
| `DATAHUB_URL` | DataHub GMS URL (e.g., https://datahub.company.com) |
| `DATAHUB_TOKEN` | Personal access token from DataHub |
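Because both variables are required, it can help to check them before launching anything that depends on them. The following is a minimal sketch; `check_datahub_env` is a hypothetical helper name, not something the server provides.

```shell
# Sketch: report which required DataHub variables are missing.
# check_datahub_env is a hypothetical helper, not part of the server.
check_datahub_env() {
  missing=""
  for name in DATAHUB_URL DATAHUB_TOKEN; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing required variables:$missing" >&2
    return 1
  fi
  echo "Required DataHub variables are set"
}
```

Calling `check_datahub_env || exit 1` at the top of a startup script stops early with a readable message instead of a failed connection later.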
Optional Variables¶
| Variable | Description | Default |
|---|---|---|
| `DATAHUB_TIMEOUT` | Request timeout in seconds | 30 |
| `DATAHUB_RETRY_MAX` | Maximum retry attempts | 3 |
| `DATAHUB_DEFAULT_LIMIT` | Default search result limit | 10 |
| `DATAHUB_MAX_LIMIT` | Maximum allowed limit | 100 |
| `DATAHUB_MAX_LINEAGE_DEPTH` | Maximum lineage traversal depth | 5 |
| `DATAHUB_CONNECTION_NAME` | Display name for primary connection | datahub |
| `DATAHUB_ADDITIONAL_SERVERS` | JSON map of additional servers | (empty) |
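To preview which values would be in effect for the current shell, the documented defaults can be mirrored with shell parameter expansion. This is only a sketch: `datahub_effective_config` is a hypothetical helper, and the server applies its own defaults regardless of what this prints.

```shell
# Sketch: print each optional setting, falling back to the documented default
# when the variable is unset or empty. Hypothetical helper for inspection only.
datahub_effective_config() {
  printf 'timeout=%s\n' "${DATAHUB_TIMEOUT:-30}"
  printf 'retry_max=%s\n' "${DATAHUB_RETRY_MAX:-3}"
  printf 'default_limit=%s\n' "${DATAHUB_DEFAULT_LIMIT:-10}"
  printf 'max_limit=%s\n' "${DATAHUB_MAX_LIMIT:-100}"
  printf 'max_lineage_depth=%s\n' "${DATAHUB_MAX_LINEAGE_DEPTH:-5}"
  printf 'connection_name=%s\n' "${DATAHUB_CONNECTION_NAME:-datahub}"
}
```

Running `datahub_effective_config` after exporting your overrides shows at a glance which settings differ from the defaults in the table.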
Example Configuration¶
# Required
export DATAHUB_URL=https://datahub.company.com
export DATAHUB_TOKEN=your_personal_access_token
# Optional tuning
export DATAHUB_TIMEOUT=60
export DATAHUB_DEFAULT_LIMIT=20
export DATAHUB_MAX_LIMIT=50
Multi-Server Configuration¶
Connect to multiple DataHub instances simultaneously. Useful for:
- Production and staging environments
- Multi-tenant deployments
- Cross-environment metadata comparison
Setting Up Multiple Servers¶
# Primary server configuration
export DATAHUB_URL=https://prod.datahub.example.com/api/graphql
export DATAHUB_TOKEN=prod-token
export DATAHUB_CONNECTION_NAME=prod # Optional: customize display name
# Additional servers as JSON
export DATAHUB_ADDITIONAL_SERVERS='{
  "staging": {
    "url": "https://staging.datahub.example.com/api/graphql",
    "token": "staging-token"
  },
  "dev": {
    "url": "https://dev.datahub.example.com/api/graphql"
  }
}'
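Hand-written JSON inside a shell string is easy to break with a stray comma or quote, so a quick syntax check before startup can save time. The sketch below assumes `python3` is on the `PATH` and uses its standard `json.tool` module; `validate_additional_servers` is a hypothetical helper name.

```shell
# Sketch: fail fast if DATAHUB_ADDITIONAL_SERVERS is not valid JSON.
# Assumes python3 is installed; validate_additional_servers is hypothetical.
validate_additional_servers() {
  if [ -z "${DATAHUB_ADDITIONAL_SERVERS:-}" ]; then
    echo "DATAHUB_ADDITIONAL_SERVERS is empty; only the primary server is configured"
    return 0
  fi
  # json.tool reads stdin and exits non-zero when the input cannot be parsed.
  if printf '%s' "$DATAHUB_ADDITIONAL_SERVERS" | python3 -m json.tool >/dev/null 2>&1; then
    echo "DATAHUB_ADDITIONAL_SERVERS is valid JSON"
  else
    echo "DATAHUB_ADDITIONAL_SERVERS is not valid JSON" >&2
    return 1
  fi
}
```

This only checks syntax. It does not confirm that each entry has a `url` or that the servers are reachable.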
Additional Server Options¶
Each additional server can override the settings below. Apart from `url`, which is required, any field that is omitted is inherited from the primary server:
| Field | Description |
|---|---|
| `url` | DataHub GMS URL (required) |
| `token` | Access token (inherits from primary) |
| `timeout` | Request timeout in seconds |
| `retry_max` | Maximum retry attempts |
| `default_limit` | Default search limit |
| `max_limit` | Maximum allowed limit |
| `max_lineage_depth` | Maximum lineage depth |
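As an illustration of per-server overrides, the fragment below gives a staging server a longer timeout, more retries, and a lower result cap, while leaving the remaining fields to be inherited. The values are arbitrary, and writing the numeric fields as JSON numbers rather than strings is an assumption here.

```shell
# Illustrative values only. Fields not listed for "staging"
# (default_limit, max_lineage_depth) fall back to the primary server's settings.
# Assumption: numeric fields are written as JSON numbers.
export DATAHUB_ADDITIONAL_SERVERS='{
  "staging": {
    "url": "https://staging.datahub.example.com/api/graphql",
    "token": "staging-token",
    "timeout": 60,
    "retry_max": 5,
    "max_limit": 50
  }
}'
```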
Using Multiple Servers¶
- Use `datahub_list_connections` to see available connections
- Pass the `connection` parameter to any tool to target a specific server
- If `connection` is omitted, the default (primary) server is used
Getting a DataHub Token¶
- Log in to DataHub
- Go to Settings > Access Tokens
- Generate a new token with appropriate permissions
- Copy the token value
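Once the token is copied, one way to confirm it works is to ask DataHub's GraphQL API which user the token belongs to. The sketch below assumes `curl` is installed and that you pass the full GraphQL endpoint (typically the instance URL followed by `/api/graphql`) as the first argument; `datahub_whoami` is a hypothetical helper name.

```shell
# Sketch: check that a token is accepted by querying the authenticated user.
# $1 = full GraphQL endpoint URL (assumed to end in /api/graphql)
# $2 = personal access token
# datahub_whoami is a hypothetical helper, not part of DataHub.
datahub_whoami() {
  curl --silent --show-error --fail \
    --request POST \
    --header "Authorization: Bearer $2" \
    --header "Content-Type: application/json" \
    --data '{"query": "{ me { corpUser { urn } } }"}' \
    "$1"
}
```

For example, `datahub_whoami "https://datahub.company.com/api/graphql" "$DATAHUB_TOKEN"` should print a JSON response containing your user URN if the token is valid. With `--fail`, an HTTP error such as a rejected token makes `curl` exit non-zero.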
Security Considerations¶
- Never commit tokens to version control
- Use environment variables or secret management
- Tokens should have minimal required permissions
- Rotate tokens periodically
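One simple way to keep a token out of shell history and version control is to store it in a file outside the repository, readable only by you, and load it at startup. This is a sketch under that assumption: the default path and the `load_datahub_token` helper are hypothetical, and a dedicated secret manager is preferable where one is available.

```shell
# Sketch: load DATAHUB_TOKEN from a private file instead of hardcoding it.
# The default path is a hypothetical example; restrict access to the file
# with: chmod 600 "$HOME/.config/datahub/token"
load_datahub_token() {
  token_file="${1:-$HOME/.config/datahub/token}"
  if [ ! -r "$token_file" ]; then
    echo "Token file not readable: $token_file" >&2
    return 1
  fi
  # Command substitution drops the trailing newline from the file.
  DATAHUB_TOKEN="$(cat "$token_file")"
  export DATAHUB_TOKEN
}
```

Rotating the token then only means replacing the file's contents; scripts that call `load_datahub_token` do not need to change.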