Adapter Configuration

This guide covers all the database adapters supported by DWH and their specific configuration options. Each adapter is designed to work with specific database clients and provides database-specific optimizations.

PostgreSQL Adapter

The PostgreSQL adapter uses the pg gem and provides full-featured RDBMS support.

Basic Configuration

postgres = DWH.create(:postgres, {
  host: 'localhost',
  port: 5432,                    # Default: 5432
  database: 'mydb',
  schema: 'public',              # Default: 'public'
  username: 'user',
  password: 'password',
  client_name: 'My Application'  # Default: 'DWH Ruby Gem'
})

SSL Configuration

# Basic SSL
postgres = DWH.create(:postgres, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  password: 'password',
  ssl: true,
  extra_connection_params: {
    sslmode: 'require'  # disable, prefer, require, verify-ca, verify-full
  }
})

# Certificate-based SSL
postgres = DWH.create(:postgres, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  ssl: true,
  extra_connection_params: {
    sslmode: 'verify-full',
    sslrootcert: '/path/to/ca-cert.pem',
    sslcert: '/path/to/client-cert.pem',
    sslkey: '/path/to/client-key.pem'
  }
})

Advanced Configuration

postgres = DWH.create(:postgres, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  password: 'password',
  query_timeout: 3600,  # seconds, default: 3600
  extra_connection_params: {
    application_name: 'Data Analysis Tool',
    connect_timeout: 10,
    options: '-c maintenance_work_mem=256MB'
  }
})

Redshift Adapter

The Redshift adapter uses the pg gem and provides full-featured RDBMS support.

Basic Configuration

redshift = DWH.create(:redshift, {
  host: 'localhost',
  port: 5432,                    # Default: 5432
  database: 'mydb',
  schema: 'public',              # Default: 'public'
  username: 'user',
  password: 'password',
  client_name: 'My Application'  # Default: 'DWH Ruby Gem'
})

SSL Configuration

# Basic SSL
redshift = DWH.create(:redshift, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  password: 'password',
  ssl: true,
  extra_connection_params: {
    sslmode: 'require'  # disable, prefer, require, verify-ca, verify-full
  }
})

# Certificate-based SSL
redshift = DWH.create(:postgres, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  ssl: true,
  extra_connection_params: {
    sslmode: 'verify-full',
    sslrootcert: '/path/to/ca-cert.pem',
    sslcert: '/path/to/client-cert.pem',
    sslkey: '/path/to/client-key.pem'
  }
})

Advanced Configuration

redshift = DWH.create(:redshift, {
  host: 'localhost',
  database: 'mydb',
  username: 'user',
  password: 'password',
  query_timeout: 3600,  # seconds, default: 3600
  extra_connection_params: {
    application_name: 'Data Analysis Tool',
    connect_timeout: 10,
    options: '-c maintenance_work_mem=256MB'
  }
})

Snowflake

Snowflake adapter use the REST apis (https) to connect and query. This adapter also supports Multi-Database authentication methods: Personal Access Token, Key Pair, and OAuth.

Basic connection with Personal Access Token

DWH.create(:snowflake, {
    auth_mode: 'pat',
    account_identifier: 'myorg-myaccount',
    personal_access_token: 'your-token-here',
    warehouse: 'COMPUTE_WH',
    database: 'ANALYTICS',
    schema: 'PUBLIC'
})

Connection with Key Pair Authentication

 DWH.create(:snowflake, {
    auth_mode: 'kp',
    account_identifier: 'myorg-myaccount.us-east-1',
    username: 'john_doe',
    private_key: '/path/to/private_key.pem',
    warehouse: 'COMPUTE_WH',
    database: 'ANALYTICS'
})

Connecting with OAuth

This is the Snowflake OAuth mechanism. Not the External one. You must first create an OAuth security integration and apply to the releveant roles. Follow this document for more.

adapter = DWH.create(:snowflake, {
  auth_mode: 'oauth',
  account_identifier: 'myorg-myaccount.us-east-1',
  oauth_client_id: '<YOUR_CLIENT_ID>',
  oauth_client_secret: '<YOUR_CLIENT_SECRET>',
  oauth_redirect_url: 'https://localhost:3030/some/path',
  database: 'ANALYTICS',
  client_name: 'myapp' # sent as user agent header value
})

To successfully use OAuth you have to pass the adapter valid access and refresh tokens. Or, it can generate them from a valid authorization code.

The typical flow is like so:

  1. Generate an authorization code by visiting the url generated by adapter.authorization_url. This will redirect to the configured oauth_redirect_url. You must be able to retrieve the code from there.
  2. Take the code from above and generate new access tokens: adapter.generate_oauth_tokens(code). This will return Hash with access_token and refresh_token. You can cache and reuse this until the refresh_token gets expired. This method will also apply the token to the current adapter instance.
  3. You can apply an existing set of tokens like so:adapter.apply_oauth_tokens(access_token: token, refresh_token: token, expires_at: Time.now)

MySQL Adapter

The MySQL adapter uses the mysql2 gem. Note that MySQL's concept of "database" maps to "schema" in DWH.

Basic Configuration

mysql = DWH.create(:mysql, {
  host: '127.0.0.1',            # Use 127.0.0.1 for local Docker instances
  port: 3306,                   # Default: 3306
  database: 'mydb',
  username: 'user',
  password: 'password',
  client_name: 'My Application' # Default: 'DWH Ruby Gem'
})

SSL Configuration

# Basic SSL
mysql = DWH.create(:mysql, {
  host: '127.0.0.1',
  database: 'mydb',
  username: 'user',
  password: 'password',
  ssl: true,  # Defaults ssl_mode to 'required'
  extra_connection_params: {
    ssl_mode: 'verify_identity',  # disabled, preferred, required, verify_ca, verify_identity
    sslca: '/path/to/ca-cert.pem',
    sslcert: '/path/to/client-cert.pem',
    sslkey: '/path/to/client-key.pem'
  }
})

Advanced Configuration

mysql = DWH.create(:mysql, {
  host: 'mysql.example.com',
  database: 'analytics',
  username: 'analyst',
  password: 'password',
  client_name: "My App", # defaults to 'DWH Ruby Gem'
  query_timeout: 1800,  # seconds, default: 3600
  extra_connection_params: {
    encoding: 'utf8mb4',
    read_timeout: 60,
    write_timeout: 60,
    connect_timeout: 10
  }
})

SQL Server Adapter

The SQL Server adapter uses the tiny_tds gem and supports both on-premises and Azure SQL Server.

Basic Configuration

sqlserver = DWH.create(:sqlserver, {
  host: 'localhost',
  port: 1433,                   # Default: 1433
  database: 'mydb',
  username: 'sa',
  password: 'password',
  client_name: 'My Application' # Default: 'DWH Ruby Gem'
})

Azure SQL Server

azure_sql = DWH.create(:sqlserver, {
  host: 'myserver.database.windows.net',
  database: 'mydb',
  username: 'myuser@myserver',
  password: 'password',
  azure: true,
  client_name: 'My Application'
})

Advanced Configuration

sqlserver = DWH.create(:sqlserver, {
  host: 'sql.example.com',
  database: 'analytics',
  username: 'analyst',
  password: 'password',
  query_timeout: 1800,  # seconds, default: 3600
  extra_connection_params: {
    container: true,     # For SQL Server running in containers
    use_utf16: false,    # Character encoding options
    timeout: 60,         # Connection timeout
    login_timeout: 60    # Login timeout
  }
})

Multi-Database Operations

# List tables in another database
tables = sqlserver.tables(catalog: 'other_database')

# Get metadata for table in another database
 = sqlserver.('other_database.dbo.my_table')
# OR
 = sqlserver.('my_table', catalog: 'other_database')

DuckDB Adapter

The DuckDB adapter uses the ruby-duckdb gem for in-process analytical queries. This requires DuckDB header files and library to already be installed.

Basic Configuration

# File-based database
duckdb = DWH.create(:duckdb, {
  file: '/path/to/my/database.duckdb',
  schema: 'main'  # Default: 'main'
})

# In-memory database
duckdb = DWH.create(:duckdb, {
  file: ':memory:'
})

Read-Only Mode

duckdb = DWH.create(:duckdb, {
  file: '/path/to/readonly/database.duckdb',
  duck_config: {
    access_mode: 'READ_ONLY'
  }
})

Advanced Configuration

duckdb = DWH.create(:duckdb, {
  file: '/path/to/my/database.duckdb',
  duck_config: {
    access_mode: 'READ_WRITE',
    max_memory: '2GB',
    threads: 4,
    temp_directory: '/tmp/duckdb'
  }
})

SQLite Adapter

The SQLite adapter uses the sqlite3 gem for lightweight embedded database analytics. It's optimized for analytical workloads with WAL mode enabled by default for better concurrent read performance.

Basic Configuration

# File-based database
sqlite = DWH.create(:sqlite, {
  file: '/path/to/my/database.sqlite'
})

# In-memory database
sqlite = DWH.create(:sqlite, {
  file: ':memory:'
})

Read-Only Mode

sqlite = DWH.create(:sqlite, {
  file: '/path/to/readonly/database.sqlite',
  readonly: true
})

Performance Optimization

The adapter includes default optimizations for analytical workloads:

  • WAL mode enabled by default for concurrent reads
  • 64MB cache size
  • Memory-mapped I/O (128MB)
  • Temp tables stored in memory
# Customize performance settings
sqlite = DWH.create(:sqlite, {
  file: '/path/to/my/database.sqlite',
  timeout: 5000,  # busy timeout in milliseconds, default: 5000
  pragmas: {
    cache_size: -128000,      # 128MB cache (negative means KB)
    mmap_size: 268435456,     # 256MB memory-mapped I/O
    temp_store: 'MEMORY',     # Store temp tables in memory
    synchronous: 'NORMAL'     # Faster than FULL, safe with WAL
  }
})

Disable WAL Mode

# Disable WAL mode if needed (e.g., for NFS or network filesystems)
sqlite = DWH.create(:sqlite, {
  file: '/path/to/my/database.sqlite',
  enable_wal: false
})

Advanced Configuration

sqlite = DWH.create(:sqlite, {
  file: '/path/to/analytics.sqlite',
  readonly: false,
  enable_wal: true,           # Default: true
  timeout: 10000,             # 10 second busy timeout
  pragmas: {
    journal_mode: 'WAL',      # Explicitly set WAL (done by default)
    cache_size: -256000,      # 256MB cache
    page_size: 8192,          # Larger page size for analytics
    mmap_size: 536870912,     # 512MB memory-mapped I/O
    temp_store: 'MEMORY',     # Keep temp data in memory
    synchronous: 'NORMAL',    # Balance between safety and speed
    locking_mode: 'NORMAL'    # Allow multiple connections
  }
})

Multiple Connections

Unlike DuckDB, SQLite allows multiple independent connections to the same database file:

# Multiple readers/writers to the same file
reader = DWH.create(:sqlite, { file: '/path/to/data.sqlite', readonly: true })
writer = DWH.create(:sqlite, { file: '/path/to/data.sqlite' })

# Both can operate concurrently with WAL mode enabled
data = reader.execute('SELECT * FROM sales')
writer.execute('INSERT INTO sales VALUES (...)')

Trino Adapter

The Trino adapter requires the trino-client-ruby gem and works with both Trino and Presto.

Basic Configuration

trino = DWH.create(:trino, {
  host: 'localhost',
  port: 8080,                   # Default: 8080
  catalog: 'hive',              # Required
  schema: 'default',            # Optional
  username: 'analyst',
  password: 'password',         # Optional
  client_name: 'My Application' # Default: 'DWH Ruby Gem'
})

SSL Configuration

trino = DWH.create(:trino, {
  host: 'trino.example.com',
  port: 443,
  ssl: true, # will set {ssl: {verify: false}}
  catalog: 'hive',
  username: 'analyst',
  password: 'password',
  client_name: "My App"
})

Advanced Configuration with Headers

trino = DWH.create(:trino, {
  host: 'trino.example.com',
  port: 8080,
  catalog: 'delta_lake',
  schema: 'analytics',
  username: 'analyst',
  query_timeout: 1800,  # seconds, default: 3600
  extra_connection_params: {
    http_headers: {
      'X-Trino-User' => 'Real User Name',
      'X-Trino-Source' => 'Analytics Dashboard',
      'X-Forwarded-Request' => 'client-request-id'
    },
    ssl: {
      verify: true,
    }
  }
})

Apache Druid Adapter

The Druid adapter uses HTTP API calls via the faraday gem for real-time analytics.

Basic Configuration

druid = DWH.create(:druid, {
  protocol: 'http',             # 'http' or 'https'
  host: 'localhost',
  port: 8080,                   # Default: 8081
  client_name: 'My Application' # Default: 'DWH Ruby Gem'
})

HTTPS with Basic Authentication

druid = DWH.create(:druid, {
  protocol: 'https',
  host: 'druid.example.com',
  port: 443,
  basic_auth: 'base64_encoded_credentials',  # Base64 encoded username:password
  query_timeout: 600,          # seconds, default: 600
  open_timeout: 30             # connection timeout, default: nil
})

Advanced Configuration with Context

druid = DWH.create(:druid, {
  protocol: 'https',
  host: 'druid.example.com',
  port: 8080,
  basic_auth: 'dXNlcjpwYXNz',  # base64 for 'user:pass'
  extra_connection_params: {
    context: {
      user: 'analyst_name',
      team: 'data_engineering',
      priority: 10,
      useCache: true
    }
  }
})

AWS Athena Adapter

The Athean adapter requires the aws-athena-sdk gem and works with both Trino and Presto.

Basic Configuration

athena = DWH.create(:athena, {
    region: 'us-east-1',
    database: 'default',
    s3_output_location: 's3://my-athena-results-bucket/queries/',
    access_key_id: 'AKIAIOSFODNN7EXAMPLE',
    secret_access_key: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    catalog: 'hive',       # optional will default to awsdatacatalog
    database: 'default',   # Optional. Db or schema
    workgroup: 'my-dept-strata' # optional workgroup
})

SSL Configuration

athena = DWH.create(:athena, {
    region: 'us-east-1',
    database: 'default',
    s3_output_location: 's3://my-athena-results-bucket/queries/',
    access_key_id: 'AKIAIOSFODNN7EXAMPLE',
    secret_access_key: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    catalog: 'hive',       # optional will default to awsdatacatalog
    database: 'default',   # Optional. Db or schema
    workgroup: 'my-dept-strata', # optional workgroup
    extra_connection_params: {
      ssl_ca_directory: 'path/to/certs/'
    }
})

Advanced Configuration with Headers

See full list of config options here: athena-api

Configuration Validation

DWH validates configuration parameters at creation time:

begin
  adapter = DWH.create(:postgres, { host: 'localhost' })  # Missing required database
rescue DWH::ConfigError => e
  puts "Configuration error: #{e.message}"
end

Each adapter defines required and optional parameters with validation rules. Check the adapter-specific sections above for the complete list of supported parameters.