Remote Folder (SFTP)
This section explains how to create, configure, and test a Remote Folder data source in DataDios. Remote Folder lets you connect to file systems on remote servers over the SSH/SFTP protocol.
Overview
The Remote Folder data source enables secure access to files and directories on remote Linux/Unix servers. It supports:
- Multiple authentication methods: Password or SSH private keys (RSA, ECDSA, Ed25519, DSA)
- Various file formats: CSV, JSON, Parquet, ORC, Avro, XML, TSV, and more
- Hierarchical browsing: Navigate through folder structures
- Metadata extraction: View file attributes and data schemas
Steps to Create and Test a Remote Folder Data Source
Step 1: Create a Data Source
- Navigate to the Data Sources tab in DataDios
- Click + CREATE DS
- From the list of available data source types, select Remote Folder
Step 2: Fill Connection Details
In the Connection Details form, provide the required parameters:
Required Parameters
| Parameter | Description | Example |
|---|---|---|
| Host | Remote server hostname or IP address | 192.168.1.100 or myserver.example.com |
| Username | SSH username for authentication | ubuntu, ec2-user |
| Folder Path | Absolute path to the folder on remote server | /home/user/data or /var/data/files |
Authentication (Choose One)
Option A: Password Authentication
| Parameter | Description |
|---|---|
| Password | SSH password for the user |
Option B: SSH Private Key Authentication
| Parameter | Description |
|---|---|
| PEM Data | SSH private key content (paste the entire key including BEGIN/END headers) |
| Passphrase | (Optional) Passphrase if the private key is encrypted |
DataDios supports multiple SSH key algorithms:
- RSA (most common)
- ECDSA (elliptic curve)
- Ed25519 (modern, recommended)
- DSA (legacy)
Optional Parameters
| Parameter | Description | Example |
|---|---|---|
| Group | Grouping for organizing data sources | Production, Development |
| Object Types | Filter files by type (comma-separated) | CSV,JSON,PARQUET or * for all |
| Secret Name | Name of a secret held in a secret store (AWS Secrets Manager, Azure Key Vault) | my-sftp-credentials |
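The parameter rules above (three required fields, one authentication method, an absolute folder path) can be checked before a configuration is submitted. The following is a minimal sketch, not part of DataDios: the key names follow the JSON examples later in this section, while `secret_name` and the rule that a secret reference can stand in for inline credentials are assumptions made for illustration.

```python
from pathlib import PurePosixPath

REQUIRED_FIELDS = ("host", "username", "folder_path")


def validate_config(config: dict) -> list:
    """Return a list of problems found in a Remote Folder config dict."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(config.get(field) or "").strip():
            problems.append(f"missing required field: {field}")

    # Remote servers are Linux/Unix, so the path is checked with POSIX rules.
    folder_path = config.get("folder_path") or ""
    if folder_path and not PurePosixPath(folder_path).is_absolute():
        problems.append("folder_path should be an absolute path")

    has_password = bool(config.get("password"))
    has_key = bool(config.get("pem_data"))
    # Assumption: a secret store reference may replace inline credentials.
    has_secret = bool(config.get("secret_name"))
    if has_password and has_key:
        problems.append("choose one of password or pem_data, not both")
    elif not (has_password or has_key or has_secret):
        problems.append("no authentication provided: set password or pem_data")

    if config.get("passphrase") and not has_key:
        problems.append("passphrase is set but pem_data is missing")
    return problems
```

An empty list means the dictionary passed these checks; it does not guarantee that the server will accept the connection, so Test Connection is still needed.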
Step 3: Test Connection
- After entering details, click Test Connection
- Ensure the connection is validated successfully
- If using SSH keys, verify the key format is correct (PEM format with proper line breaks)
If the test fails, check the following:
- Authentication failed: Verify that the username and the password or key are correct
- Host key verification: DataDios automatically accepts new host keys
- Permission denied: Ensure the user has read access to the specified folder path
- Key format error: Make sure the private key includes the -----BEGIN ... KEY----- and -----END ... KEY----- headers
Step 4: Save Data Source
- If the test succeeds, click Create to save the data source
- You will be redirected to the Datasource Listing Page, where the Remote Folder data source will appear
Step 5: Explore Data Source Items
- Expand the Remote Folder data source to view all items (folders and files)
- The file browser displays:
  - Folders: Click to expand and view contents
  - Files: Shows file name, type, and last modified date
- To view metadata about any item:
  - Click the item name
  - Click the three stacked lines icon to open the Object Metadata pop-up
- You can also explore additional features in the Metadata Explorer:
  - Object Data: View the actual data present in the selected file (e.g., CSV rows, JSON content)
  - Attributes: View column names and inferred data types for structured files
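To make "inferred data types" concrete, here is a rough sketch of how column types could be inferred from a CSV sample. It illustrates the general idea only; how DataDios actually infers types is not described in this section, and the type names (`integer`, `float`, `string`) and the `sample_rows` parameter are assumptions.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest type that fits every non-empty value."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "string"
    for name, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "string"


def infer_csv_schema(text, sample_rows=100):
    """Return {column_name: inferred_type} from the first rows of CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    # Read at most sample_rows rows; short rows yield None for missing cells.
    rows = [row for _, row in zip(range(sample_rows), reader)]
    return {
        column: infer_type([row[column] or "" for row in rows])
        for column in (reader.fieldnames or [])
    }
```

For example, `infer_csv_schema("id,price,name\n1,9.5,apple\n2,3,pear\n")` reports `id` as `integer`, `price` as `float`, and `name` as `string`.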
Supported File Formats
| Format | Extension | Features |
|---|---|---|
| CSV | .csv | Comma-separated values, auto-detect schema |
| TSV | .tsv | Tab-separated values |
| JSON | .json | JSON documents |
| Parquet | .parquet | Columnar storage format |
| ORC | .orc | Optimized Row Columnar format |
| Avro | .avro | Apache Avro data serialization |
| AVSC | .avsc | Avro schema files |
| XML | .xml | XML documents |
| Text | .txt, .text | Plain text files |
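The Object Types filter can be thought of as a match between a file's extension and the formats in the table above. The sketch below shows one way such a filter could behave. It is an illustration under assumptions, not DataDios's implementation: the type name `TEXT` for `.txt`/`.text` files and the case-insensitive matching are guesses.

```python
from pathlib import PurePosixPath

# Extension-to-format mapping taken from the Supported File Formats table.
EXTENSION_TO_FORMAT = {
    ".csv": "CSV", ".tsv": "TSV", ".json": "JSON", ".parquet": "PARQUET",
    ".orc": "ORC", ".avro": "AVRO", ".avsc": "AVSC", ".xml": "XML",
    ".txt": "TEXT", ".text": "TEXT",  # "TEXT" is an assumed type name
}


def parse_object_types(object_types):
    """Turn "CSV, json" into {"CSV", "JSON"}; "*" or empty means all types."""
    names = {part.strip().upper() for part in object_types.split(",") if part.strip()}
    return None if not names or "*" in names else names


def filter_files(filenames, object_types="*"):
    """Keep the files whose extension maps to one of the selected formats."""
    wanted = parse_object_types(object_types)
    if wanted is None:
        return list(filenames)
    return [
        name for name in filenames
        if EXTENSION_TO_FORMAT.get(PurePosixPath(name).suffix.lower()) in wanted
    ]
```

With `object_types="CSV,JSON"`, a listing of `a.csv`, `b.JSON`, and `c.parquet` is reduced to the first two files; with `"*"`, every file is kept.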
Connection Configuration Examples
Example 1: Password Authentication
{
"host": "192.168.1.100",
"username": "datauser",
"password": "SecurePassword123",
"folder_path": "/home/datauser/datasets",
"object_types": "CSV,PARQUET,JSON"
}
Example 2: SSH Private Key (RSA)
{
"host": "myserver.example.com",
"username": "ubuntu",
"pem_data": "-----BEGIN RSA PRIVATE KEY-----\nMIIEpAIBAAKCAQEA...\n-----END RSA PRIVATE KEY-----",
"folder_path": "/var/data/files",
"object_types": "*"
}
Example 3: SSH Private Key with Passphrase (Ed25519)
{
"host": "secure-server.example.com",
"username": "admin",
"pem_data": "-----BEGIN OPENSSH PRIVATE KEY-----\nb3BlbnNzaC1rZXktdjEAAAA...\n-----END OPENSSH PRIVATE KEY-----",
"passphrase": "my-key-passphrase",
"folder_path": "/data/analytics"
}
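In Examples 2 and 3, the line breaks inside the private key appear as `\n` escapes because JSON strings cannot contain raw newlines. A small helper can produce that form from the key text instead of editing it by hand. This is a sketch: the function name `build_key_config` is hypothetical, and the field names are copied from the examples above.

```python
import json


def build_key_config(host, username, folder_path, pem_text, passphrase=None):
    """Build a key-based config as JSON text.

    json.dumps escapes each newline in the key as a backslash followed by n.
    """
    config = {
        "host": host,
        "username": username,
        "pem_data": pem_text.strip(),
        "folder_path": folder_path,
    }
    if passphrase:
        config["passphrase"] = passphrase
    return json.dumps(config, indent=2)
```

The key text would typically come from the private key file, for example `pathlib.Path.home().joinpath(".ssh", "id_ed25519").read_text()`. Treat the resulting JSON as a secret, since it contains the private key.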
Best Practices
- Use SSH Keys instead of passwords for enhanced security
- Use Ed25519 Keys for modern, secure, and fast authentication
- Store Credentials in Secret Stores (AWS Secrets Manager, Azure Key Vault) to avoid hardcoding
- Always Test Connection before saving to ensure configuration is correct
- Use Specific Object Types to filter only relevant file types and improve performance
- Organize with Groups for easier management of multiple Remote Folder data sources
- Use Absolute Paths for folder_path to avoid ambiguity
Security Considerations
- All connections are made over SSH (port 22 by default), ensuring encrypted data transfer
- Private keys are stored securely and never exposed in logs
- Use encrypted private keys with passphrases for additional security
- Consider using bastion hosts or jump servers for accessing servers in private networks
Generating SSH Keys
Linux / macOS
Generate Ed25519 Key (Recommended)
ssh-keygen -t ed25519 -C "your-email@example.com"
Generate RSA Key
ssh-keygen -t rsa -b 4096 -C "your-email@example.com"
Copy Public Key to Server
ssh-copy-id username@remote-server
View Private Key (for pasting into DataDios)
cat ~/.ssh/id_ed25519
Windows
Option 1: Using PowerShell (Windows 10/11)
# Generate Ed25519 key
ssh-keygen -t ed25519 -C "your-email@example.com"
# Generate RSA key
ssh-keygen -t rsa -b 4096 -C "your-email@example.com"
Keys are saved to: C:\Users\YourUsername\.ssh\
Option 2: Using Git Bash
ssh-keygen -t ed25519 -C "your-email@example.com"
After Generating Keys
- Copy public key to server: Add the contents of id_ed25519.pub (or id_rsa.pub) to the server's ~/.ssh/authorized_keys
- Use private key in DataDios: Copy the contents of id_ed25519 (or id_rsa) and paste them into the PEM Data field
Example: View and copy private key
# Linux/macOS/Git Bash
cat ~/.ssh/id_ed25519
# Windows PowerShell
Get-Content $env:USERPROFILE\.ssh\id_ed25519
When prompted during key generation:
- File location: Press Enter to use default
- Passphrase: Optional but recommended for extra security
Troubleshooting
Common Issues
| Issue | Possible Cause | Solution |
|---|---|---|
| Connection timeout | Server unreachable or firewall blocking | Verify network connectivity and firewall rules |
| Authentication failed | Wrong credentials | Double-check username and password/key |
| Permission denied | User lacks read access | Ensure user has permissions on the folder |
| Key format error | Invalid PEM format | Verify key has proper headers and line breaks |
| File not found | Path doesn't exist | Verify folder_path exists on the remote server |
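For the "Connection timeout" row, a quick way to separate network problems from credential problems is to check whether the SSH port accepts TCP connections at all. The following sketch uses only the Python standard library; the function name `can_reach` is hypothetical, and port 22 is assumed as the default mentioned under Security Considerations.

```python
import socket


def can_reach(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

If `can_reach("myserver.example.com")` returns False, look at DNS, routing, and firewall rules first. If it returns True but Test Connection still fails, the cause is more likely authentication or permissions.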
Key Format Tips
When pasting SSH private keys, ensure:
- The key includes the -----BEGIN ... KEY----- header
- The key includes the -----END ... KEY----- footer
- Line breaks are preserved (if pasting from web form, the system will auto-fix)
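The checks above can also be run locally before pasting a key. The sketch below verifies that the BEGIN and END markers are present and match, then restores line breaks in a key that was flattened onto one line. It shows one possible normalization, not DataDios's actual auto-fix logic. It assumes a 64-character line width and a key body that is pure base64, so it is not suitable for legacy encrypted PEM keys that carry `Proc-Type` and `DEK-Info` header lines.

```python
import re

PEM_PATTERN = re.compile(
    r"-----BEGIN ([A-Z0-9 ]*PRIVATE KEY)-----(.*?)-----END \1-----", re.DOTALL
)


def normalize_pem(text):
    """Return the key with markers on their own lines and a 64-column body.

    Raises ValueError when the BEGIN/END markers are missing or do not match.
    """
    # Turn literal backslash-n sequences (e.g. copied from JSON) into newlines.
    text = text.replace("\\n", "\n")
    match = PEM_PATTERN.search(text)
    if match is None:
        raise ValueError("missing or mismatched BEGIN/END private key markers")
    label, body = match.group(1), match.group(2)
    # Remove all whitespace, then re-wrap the base64 payload.
    payload = "".join(body.split())
    lines = [payload[i:i + 64] for i in range(0, len(payload), 64)]
    return "\n".join(
        [f"-----BEGIN {label}-----", *lines, f"-----END {label}-----"]
    ) + "\n"
```

A key whose markers disagree, such as a BEGIN RSA header paired with an END OPENSSH footer, raises `ValueError` instead of being silently repaired.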
For more details on configuring data sources with secret stores, see the Secret Stores documentation.