CSV Connector
A Comma Separated Values (CSV) file is a simple text file that stores data in a tabular format, with each line (row) representing a record and each field (column) within that record separated by commas. CSV files are commonly used to store data, such as assets and inventory, that can be easily exported and imported into other programs, such as spreadsheets, databases, and data visualization tools.
Brinqa offers a CSV Connector to facilitate the ingestion of data from a CSV file. If your data is stored in a proprietary format that the existing connectors cannot accommodate, you can export it to CSV files and then use the CSV Connector to ingest it into the Brinqa Platform for further processing. This integration helps you build a unified view of your attack surface and strengthen your cybersecurity posture.
This document details the information you must provide for the connector to access your file and retrieve data. See create a data integration for step-by-step instructions on setting up the integration.
Connection settings
When setting up a data integration, select CSV Connector from the Connector drop-down. If you cannot find the connector in the drop-down, make sure that you have installed it first. You must provide the following information:
-
Server: To establish a secure tunnel between the Brinqa Platform and your machine, where the CSV file resides, you may need to install a Brinqa Agent on the machine and create a data server for it. If this is required, select the data server that you have created.
-
Data directory: Enter the fully qualified path to the data directory where the CSV files reside. Wildcards (*) are allowed in the path, e.g., /feed/vendors/vendor_*.csv or /feed/mobile/mobile_device*.*.
Note: When multiple CSV files are targeted using a wildcard, the CSV Connector processes them based on their last modified time, from oldest to newest. This ensures that the most recent data takes precedence when there are duplicates.
-
Target: Specify the unified data model (UDM) to map your data to. The default selection is Record.
For example, if your CSV file contains inventory data, you can select Host or any other data model that extends Asset. If you keep the default, Brinqa Platform creates the source data model (SDM) but does not map it to any UDM.
-
Identifier fields: (Optional) Using a comma-separated list, specify the fields that can be used as identifiers. To ensure that there are no duplicates, it is crucial to declare the fields in the correct order.
-
Unique fields: (Optional) Enter a comma-separated list of fields that, when combined together, represent a unique row. This is useful when your CSV file contains duplicates.
-
Numeric fields: (Optional) Enter a comma-separated list of fields that contain numbers. The format is field=type, where field is the name of the field and type is the type of number, such as integer, double, or long.
-
Boolean fields: (Optional) Enter a comma-separated list of fields that contain boolean values. Example usage: Active, Internet Facing.
-
Date fields: (Optional) Enter a comma-separated list of fields that contain dates or dates and times in the Joda-Time format. The format is field=format, where field is the name of the field and format is the Joda-Time format used in the field. Refer to the Joda-Time DateTimeFormat documentation for a full list of supported formats and patterns.
For example, if you have a field named First found that uses the MM/dd/yyyy format, and another field named Last Updated that uses the MM-dd-yyyy format, enter First found=MM/dd/yyyy,Last Updated=MM-dd-yyyy.
Common Joda-Time formats:
-
yyyy-MM-dd: Four-digit year, two-digit month, two-digit day (e.g., 2024-06-17).
-
yyyy-MM-dd HH:mm:ss: Full datetime with hours, minutes, and seconds (e.g., 2024-06-17 14:30:00).
-
MM/dd/yyyy: Two-digit month, two-digit day, four-digit year (e.g., 06/17/2024).
-
dd-MMM-yyyy: Two-digit day, short month name, four-digit year (e.g., 17-Jun-2024).
-
yyyyMMdd: Eight-digit date (e.g., 20240617).
The CSV Connector also supports date fields in Epoch time formats, in both milliseconds and seconds:
-
For milliseconds, use the format epochMillis.
-
For seconds, use the format epochSeconds.
Milliseconds:
-
epochMillis: 1609459200000 represents 01/01/2021 @ 12:00 AM (UTC) in milliseconds.
-
epochMillis: 1609455600000 represents 12/31/2020 @ 11:00 PM (UTC) in milliseconds.
Seconds:
-
epochSeconds: 1609459200 represents 01/01/2021 @ 12:00 AM (UTC) in seconds.
-
epochSeconds: 1609455600 represents 12/31/2020 @ 11:00 PM (UTC) in seconds.
-
Multi-value fields: (Optional) Enter a comma-separated list of fields containing multiple values. The format is field=delimiter, where field is the name of the field and delimiter is the delimiter used to separate the values.
For example, if you have a field named Ratings that contains "High: Medium: Low" as values, and another field named Compliance that contains "FedRAMP; HIPAA; PCI" as values, enter Ratings=:,Compliance=;.
-
Multi-row fields: (Optional) Enter a comma-separated list of fields that contain values spanning multiple rows.
In the following example, the IP address for host "h.brinqa.net" is specified across two rows:
Hostname, IP address
h.brinqa.net, 10.0.0.0
h.brinqa.net, 192.168.1.1
After the data has been processed in the Brinqa Platform, the hostname attribute displays "h.brinqa.net" and the ipAddress attribute contains ["10.0.0.0", "192.168.1.1"]. To prevent any duplicates in the ipAddress attribute, you must ensure that the IP address field contains unique values.
-
File encoding: Specify the encoding of the CSV file. The default encoding is UTF-8.
-
Text qualifier: Specify the qualifier that determines the start and end of a field. The default qualifier is the double quote (").
-
Field delimiter: Specify the delimiter for the fields. The default delimiter is the comma (,).
-
EOL characters: Specify the characters that represent the end of a line (EOL). The default is CRLF.
-
Failure threshold: Specify the acceptable number of failed rows before processing of the CSV file is halted. A value of -1 indicates that any number of failures is acceptable. When the threshold is reached, the CSV Connector halts the ingestion process, allowing you to address any issues before resuming.
-
Max age: Specify the maximum number of days to keep the CSV file in the Brinqa Platform. Any value less than 0 indicates that the file does not expire, and 0 indicates not to keep the file. The default is -1.
-
Max files: Specify the maximum number of CSV files to keep in the data directory after they have been processed. Any value less than 0 indicates that there is no limit on the number of files to retain, and 0 indicates not to keep any file. The default is -1.
-
Rename or move the file after it's processed: Select this option to rename or move the file after it has been imported into the Brinqa Platform.
Tip: If you enable this option, after a CSV file has been ingested, the connector renames the file by appending .processed to the file name. This ensures that the same file won't be ingested multiple times in subsequent sync operations.
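The ordering described in the Data directory note can be illustrated with a short Python sketch. This is not the connector's implementation; the function name files_oldest_first and the use of Python's glob module are assumptions made purely for illustration. It expands a wildcard path and orders the matches by last modified time, oldest first, so that rows from a newer file would be applied after rows from an older one.

```python
import glob
import os


def files_oldest_first(pattern):
    """Return paths matching the wildcard pattern, oldest modification time first."""
    matches = glob.glob(pattern)
    # Sort by last modified time; ties fall back to the path for a stable order.
    return sorted(matches, key=lambda path: (os.path.getmtime(path), path))
```

For example, files_oldest_first("/feed/vendors/vendor_*.csv") would list the vendor files in the order in which a "last file wins" import would read them.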
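To make the Date fields setting more concrete, here is a rough Python sketch of how a field=format list and the two epoch formats could be interpreted. It is illustrative only: parse_field_spec, to_utc, and the small Joda-to-strptime table are hypothetical helpers that cover only the patterns mentioned in this document, and they do not reflect how the connector parses dates internally. Values without a time zone are assumed to be UTC.

```python
from datetime import datetime, timezone

# Hypothetical mapping from the Joda-Time patterns mentioned above to strptime codes.
# Note: %b (short month name) depends on the active locale.
JODA_TO_STRPTIME = {
    "yyyy-MM-dd": "%Y-%m-%d",
    "yyyy-MM-dd HH:mm:ss": "%Y-%m-%d %H:%M:%S",
    "MM/dd/yyyy": "%m/%d/%Y",
    "MM-dd-yyyy": "%m-%d-%Y",
    "dd-MMM-yyyy": "%d-%b-%Y",
    "yyyyMMdd": "%Y%m%d",
}


def parse_field_spec(spec):
    """Split 'First found=MM/dd/yyyy,Last Updated=MM-dd-yyyy' into a dict."""
    pairs = (item.split("=", 1) for item in spec.split(",") if item.strip())
    return {name.strip(): value.strip() for name, value in pairs}


def to_utc(value, fmt):
    """Convert a raw CSV value to a UTC datetime using the configured format."""
    if fmt == "epochMillis":
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    if fmt == "epochSeconds":
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    # Assumption: date strings carry no zone information and are treated as UTC.
    parsed = datetime.strptime(value, JODA_TO_STRPTIME[fmt])
    return parsed.replace(tzinfo=timezone.utc)
```

With these helpers, to_utc("1609459200000", "epochMillis") and to_utc("1609459200", "epochSeconds") both yield midnight UTC on January 1, 2021, matching the epoch examples in the Date fields setting.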
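The Multi-row fields behavior can also be sketched in a few lines of Python. The merge_multi_row function below is a hypothetical helper, and grouping rows by a single key column is an assumption; the connector may group records differently. The sketch collects the values of one column across rows that share the same key, and, in line with the caveat in the setting description, it does not remove repeated values.

```python
def merge_multi_row(rows, key_field, multi_row_field):
    """Group rows by key_field and collect multi_row_field values into a list."""
    merged = {}
    for row in rows:
        key = row[key_field]
        record = merged.setdefault(key, {key_field: key, multi_row_field: []})
        # Values are appended as-is, so repeated input values would appear twice.
        record[multi_row_field].append(row[multi_row_field])
    return list(merged.values())


rows = [
    {"Hostname": "h.brinqa.net", "IP address": "10.0.0.0"},
    {"Hostname": "h.brinqa.net", "IP address": "192.168.1.1"},
]
print(merge_multi_row(rows, "Hostname", "IP address"))
# [{'Hostname': 'h.brinqa.net', 'IP address': ['10.0.0.0', '192.168.1.1']}]
```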
Types of data to retrieve
The CSV Connector retrieves records from CSV files and maps them to the data model you specify in the Target field.
The CSV Connector does not currently support operation options for the types of data it retrieves.
For detailed steps on how to view the data retrieved from the CSV Connector in the Brinqa Platform, see How to view your data.
Sample CSV file
Below is an example of a simple CSV file, compatible with the CSV Connector, that represents package information:
name,lastSeen,owners,vendor,versionName
Google Chrome,2024-01-15 08:00:00,John Doe;Jane Smith,Google,91.0.4472.124
Microsoft Office,2024-02-20 08:00:00,Alice Johnson;Bob Brown,Microsoft,16.0.13901.20462
Adobe Acrobat Reader,2024-03-05 08:00:00,Charlie Davis;Eve Adams,Adobe,2021.001.20145
Here is a table of the above CSV file information for simplified viewing:
name | lastSeen | owners | vendor | versionName |
---|---|---|---|---|
Google Chrome | 2024-01-15 08:00:00 | John Doe;Jane Smith | Google | 91.0.4472.124 |
Microsoft Office | 2024-02-20 08:00:00 | Alice Johnson;Bob Brown | Microsoft | 16.0.13901.20462 |
Adobe Acrobat Reader | 2024-03-05 08:00:00 | Charlie Davis;Eve Adams | Adobe | 2021.001.20145 |
In the above example, each row represents a record with the following attributes:
- name: The name of the package.
- lastSeen: The date and time when the package was last seen, in the Joda-Time format yyyy-MM-dd HH:mm:ss.
- owners: A semicolon-separated list of owners for the package.
- vendor: The vendor of the package.
- versionName: The version name of the package.
If your CSV files contain more complex data, ensure each field is properly formatted. If you use multi-value fields, you must specify the delimiters for these fields in the connector settings.
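As a rough illustration of the multi-value setting, the following Python sketch reads the first rows of the sample file with the standard csv module and splits the owners column on the ; delimiter, which corresponds to entering owners=; under Multi-value fields. The read_packages function and the inline SAMPLE string are assumptions for this example and are not part of the connector.

```python
import csv
import io

SAMPLE = """name,lastSeen,owners,vendor,versionName
Google Chrome,2024-01-15 08:00:00,John Doe;Jane Smith,Google,91.0.4472.124
Microsoft Office,2024-02-20 08:00:00,Alice Johnson;Bob Brown,Microsoft,16.0.13901.20462
"""


def read_packages(text, multi_value=None):
    """Parse CSV text into dicts, splitting multi-value fields into lists."""
    multi_value = multi_value or {}
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for field, delimiter in multi_value.items():
            # Split on the configured delimiter and trim surrounding spaces.
            row[field] = [part.strip() for part in row[field].split(delimiter)]
        rows.append(row)
    return rows


packages = read_packages(SAMPLE, multi_value={"owners": ";"})
print(packages[0]["owners"])  # ['John Doe', 'Jane Smith']
```

Fields that are not listed in multi_value stay as plain strings, which mirrors the idea that only fields declared in the connector settings are treated as multi-valued.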
APIs
Because the CSV Connector is file-based, it doesn't rely on any API endpoints and therefore doesn't offer any operation options.
Changelog
The CSV Connector has undergone the following changes:
3.0.9
- Enhanced date time support by changing the data type of date attributes from Long to Instant.
3.0.8
- Updated dependencies.
3.0.7
- Updated dependencies.
3.0.6
- Added support for Epoch seconds and milliseconds.
3.0.5
- Fixed an issue where importing the same CSV file twice resulted in duplicated records.
3.0.4
- Stored the value 'false' in a Boolean field if the retrieved value is anything other than 'true'.
3.0.3
- Introduced configuration on a Boolean field to effectively handle boolean types.
3.0.2
- Removed the use of ImmutableSet.
3.0.1
- Initial Integration+ release.