This repository was archived by the owner on May 15, 2019. It is now read-only.
3 changes: 2 additions & 1 deletion README.md
@@ -1,6 +1,7 @@
# **Open Network Insight**
ONI Operational Analytics (OA) is a collection of modules, which includes both the data processing and transformation as well as the GUI module for data visualization.

The visualization repository contains all the front-end code and files related to the Open Network Insight visual elements, such as styles, pages, data files, etc.
The visualization repository (UI folder) contains all the front-end code and files related to the Open Network Insight visual elements, such as styles, pages, data files, etc.
Some of the technologies used are:

- [IPython==3.2.1](https://ipython.org/ipython-doc/3/index.html)
18 changes: 10 additions & 8 deletions oa/INSTALL.md
@@ -3,12 +3,13 @@
ONI Operational Analytics (OA) is a set of python modules and utilities with routines to extract and transform data, loading the results into output files.
OA represents the last step before users can score connections and analyze data in the UI.

OA scripts are very similar for the different data types supported however the code is divided into 3
main modules due to differences on the data model and what context information is required for each data type.

The three supported data types are Flow, DNS and Proxy. For more information about the type of information and insights
that can be found for each data source please visit ONI [wiki](https://github.com/Open-Network-Insight/open-network-insight/wiki).

OA scripts are very similar across the supported data types; however, the code is divided into three
main modules due to differences in the data model and in the context information required for each data type.


## Folder Structure

components -> Set of utilities prepared to provide context to raw data and
@@ -39,7 +40,7 @@ In order to execute this process there are a few prerequisites:
2. Components configuration. To find about how to configure each of the extra components included in this project
visit oa/components/[README.md](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa/components).
These components are required to add context or extract additional information that is going to complement your
original data. Each of this components are independent from each other. Based on the data type some components are
original data. Each of these components is independent of the others. Depending on the data type, some components are
required and others are not.
3. oni-ml results. Operational Analytics works and transforms Machine Learning results. The implementation of Machine Learning
in this project is through [oni-ml](https://github.com/Open-Network-Insight/oni-ml). Although the Operational Analytics
@@ -54,7 +55,7 @@ In order to execute this process there are a few prerequisites:
##Operational Analytics installation and usage
####Installation

OA installation consists on the configuration of extra modules or components and creation of a set of files.
OA installation consists of the configuration of extra modules or components and creation of a set of files.
Depending on the data type that is going to be processed some components are required and other components are not.
If users are planning to analyze the three data types supported (Flow, DNS and Proxy) then all components should be configured.

@@ -79,9 +80,9 @@ In order to execute this process there are a few prerequisites:
10.192.1.1, MySystem


3. oni-setup project contains scripts to install hive database but also includes the main configuration file for this tool.
That file is called duxbay.conf which contains different variables that the user can set up to customize their installation, in fact, some
of them are required to be updated in order to have oni-ml and oni-oa working.
3. The oni-setup project contains scripts to install the hive database and also includes the main configuration file for this tool.
The main file is called duxbay.conf, and it contains different variables that the user can set up to customize their installation. Some variables
must be updated in order to have oni-ml and oni-oa working.

To run the OA process, oni-setup must be installed. If it's already installed, just make sure the following configurations are set in the duxbay.conf file.

@@ -118,6 +119,7 @@ In order to execute this process there are a few prerequisites:
OA process for the corresponding data type.
-l Data limit. ML results usually contain thousands of records. With "Data limit", OA will process only the top K results.

The execution time of OA varies based on the number of records being processed and the data type.
When the process completes you can go to oni-oa/data/\<data type> folder and check the results.
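
As a rough illustration of what the `-l` data limit does, the following sketch keeps only the first K rows of an already ranked results file. The function name `top_k_rows` and the sample rows are hypothetical, and the sketch assumes the ML results arrive sorted by rank; it is not the actual oni-oa implementation.

```python
import csv
import io
from itertools import islice


def top_k_rows(lines, limit):
    """Return the first `limit` parsed CSV rows from an iterable of lines.

    Assumes the input is already ordered by rank (an assumption made
    for this sketch, not something confirmed from the oni-oa code).
    """
    reader = csv.reader(lines)
    return list(islice(reader, limit))


# Hypothetical sample standing in for an ML results file.
sample = io.StringIO(u"0.001,10.0.0.1\n0.002,10.0.0.2\n0.003,10.0.0.3\n")
rows = top_k_rows(sample, 2)
```

Using `islice` stops reading after K rows, so a large results file does not have to be loaded in full.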

19 changes: 10 additions & 9 deletions oa/components/README.md
@@ -1,4 +1,4 @@
# COMPONENTS
# Operational Analytics Components

This document will explain the necessary steps to configure the oni-oa components.

@@ -27,7 +27,7 @@ This document will explain the necessary steps to configure the oni-oa component


###Data
Data source module
_Data source module._

This module needs to be configured correctly to avoid errors during the oni-oa execution. Here you need to select the correct database engine to obtain the correct results while creating additional details files.
Currently oni-oa is prepared to work with Impala, but you can always configure any other database engine and make the corresponding updates in the code.
@@ -46,9 +46,9 @@ You need to update the _engine.json_ file accordingly:
}

Where:
- database engine: Whichever database engine you have installed and configured in your cluster to work with ONI. i.e. "Impala" or "Hive".
- <database engine>: Whichever database engine you have installed and configured in your cluster to work with ONI, e.g. "Impala" or "Hive".
For this key, the value you enter must exactly match one of the following keys, where you'll need to add the corresponding node name.
- node: The node name in your cluster where you have the database service running.
- <node>: The node name in your cluster where you have the database service running.
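
The exact-match rule described above can be checked with a short sketch like the one below. Everything in it is an assumption made for illustration: the function name `resolve_engine_node`, the selector key `engine_key`, the nested `node` field and the sample values are hypothetical, since the real key names of _engine.json_ are not shown in this excerpt.

```python
import json


def resolve_engine_node(config, selector_key):
    """Return (engine, node) from a parsed engine.json-style dict.

    `selector_key` is the name of the key holding the engine name; it is
    passed in because the real key name is not shown in this document.
    The nested "node" field is likewise an assumption of this sketch.
    """
    engine = config[selector_key]
    # The match is exact and case-sensitive: "Impala" != "impala".
    if engine not in config:
        raise KeyError("engine %r has no matching section" % engine)
    return engine, config[engine]["node"]


# Hypothetical file content, for illustration only.
raw = '{"engine_key": "impala", "impala": {"node": "node04"}, "hive": {"node": ""}}'
engine, node = resolve_engine_node(json.loads(raw), "engine_key")
```

A check like this fails early when the engine value and the section key differ only in case, which is an easy mistake to make when editing the file by hand.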

Example:

@@ -62,7 +62,7 @@ Example:


###Reputation
Reputation check module.
_Reputation check module._

This module is called during oni-oa execution to check the reputation for any given IP, DNS name or URI (depending on the pipeline). The reputation module makes use of two third-party services, McAfee GTI and Facebook ThreatExchange.
Each of these services is represented by a sub-module in this project: McAfee GTI is implemented by the sub-module gti and Facebook ThreatExchange by the sub-module fb. For more information see the Folder Structure section.
@@ -77,6 +77,7 @@ Each of these services are represented by a sub-module in this project, McAfee G
**Enable/Disable GTI service**

It's possible to disable any of the reputation services mentioned above; all it takes is to remove the configuration for the undesired service in gti_config.json. To learn more about it, see the section below.
To add a different reputation service, see the [reputation README](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa/components/reputation).

**Configuration**

@@ -124,7 +125,7 @@ It's possible to disable any of the reputation services mentioned above, all it
- app_secret: App secret to connect to ThreatExchange service.

###IANA
Internet Assigned Numbers Authority codes translation module.
_Internet Assigned Numbers Authority codes translation module._

**Configuration**

@@ -146,7 +147,7 @@ default location, your configuration file should look like this:


###Network Context (nc)
Network Context module.
_Network Context module._

**Pre-requisites**

@@ -175,11 +176,11 @@ configuration file should look like this:


###Geoloc
Geolocation module.
_Geolocation module._

This is an optional functionality you can enable / disable depending on your preferences.

**Pre-requisites**
**Pre-requisites**
To start using this module, you need to include a comma-separated file containing the geolocation for most (or all) IPs.
To learn more about the expected schema for this file or where to find a full geolocation db, please refer
to the [_context_](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/context/README.md) documentation
27 changes: 9 additions & 18 deletions oa/components/reputation/README.md
@@ -1,31 +1,22 @@
###GTI (gti)
DNS Global Threat Intelligence module.

This module is called in dns_oa.py for IP reputation check. The GTI module makes use of two third-party services, McAfee GTI and Facebook ThreatExchange. Each of these services are represented by a sub-module in this project, McAfee GTI is implemented by sub-module gti and Facebook ThreatExchange by sub-module fb. For more information see [Folder Structure](https://github.com/Open-Network-Insight/oni-oa/blob/1.0.1-dns_oa_readme_creation/ipython/dns/README.md#folder-structure).

## How to implement a new reputation service for DNS OA

DNS GTI comes with two sub-modules and they correspond to the reputation services we are supporting by default.
- gti: implements logic to call and return results from McAfee reputation service.
- fb: implements logic to call and return results from facebook ThreatExchange reputation service.
###Reputation
This section describes the functionality of the current reputation service modules and how you can implement your own.

It's possible to add new reputation services by implementing a new sub-module. To do that, developers should follow
these steps:

1. Map the responses of the new reputation service with DNS reputation table.
1. Map the responses of the new reputation service to the values in this reputation table.

| Key | Value |
|---|---|
| Key | Value |
|-----|-------|
|UNVERIFIED|-1|
|NONE |0 |
|LOW |1 |
|MEDIUM |2 |
|HIGH |3 |

2. Add a new configuration for the new reputation service in gti_config.json.
2. Add a new key for the new reputation service in gti_config.json.

{
"targe_columns" : [3],
{
"gti" : { …
},
"fb" : {…
@@ -36,11 +27,11 @@ DNS GTI comes with two sub-modules and they correspond to the reputation service
}
3. Create file structure for new sub-module.

[solution-user@edge-server]$ cd ~/ipython/dns/gti/
[solution-user@edge-server]$ cd ~/oni-oa/components/reputation/
[solution-user@edge-server]$ mkdir mynewreputationservice
[solution-user@edge-server]$ cd mynewreputationservice

4. Add _ _init_ _.py file.
4. Create an empty \_\_init\_\_.py file.
5. Add a new file *reputation.py*. Each sub-module should contain a reputation.py file.
6. Write your code in reputation.py. The code should contain the following structure:

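
Going back to step 1, the mapping onto the reputation table could be sketched as follows. This is separate from the reputation.py structure that step 6 refers to, which is not shown here. The numeric values come from the table above; the service labels in `SERVICE_TO_KEY` and the function name `map_response` are hypothetical, and falling back to UNVERIFIED for unknown labels is a choice made for this sketch.

```python
# Values copied from the reputation table in step 1.
REPUTATION = {
    "UNVERIFIED": -1,
    "NONE": 0,
    "LOW": 1,
    "MEDIUM": 2,
    "HIGH": 3,
}

# Hypothetical labels returned by an imaginary new service.
SERVICE_TO_KEY = {
    "clean": "NONE",
    "suspicious": "MEDIUM",
    "malicious": "HIGH",
}


def map_response(label):
    """Translate a service label into a reputation value.

    Unknown labels fall back to UNVERIFIED (a choice made for this sketch).
    """
    key = SERVICE_TO_KEY.get(label, "UNVERIFIED")
    return REPUTATION[key]
```

Keeping the table in one dictionary means every sub-module reports on the same scale, whatever vocabulary the underlying service uses.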
26 changes: 12 additions & 14 deletions oa/dns/README.md
@@ -1,16 +1,12 @@
# DNS

oni-oa sub-module for Open-Network-Insight, version 1.1

DNS sub-module will extract and transform DNS (Domain Name Service) data already ranked by oni-ml and will load into csv files for presentation layer.
The DNS sub-module extracts and transforms DNS (Domain Name System) data already ranked by oni-ml and loads it into csv files for the presentation layer.

## DNS Components

###dns_oa.py

DNS oni-oa main script.

It executes the following steps:
DNS oni-oa main script executes the following steps:


1. Creates the right folder structure to store the data and the ipython notebooks. This is:
@@ -45,30 +41,32 @@ It executes the following steps:

**Dependencies**

Before running DNS OA users need to configure components for the first time. It is important to mention that configuring these components make them work for other data sources as Flow and Proxy.

- python 2.7. [Python 2.7](https://www.python.org/download/releases/2.7/) should be installed in the node running Proxy OA.
- [Python 2.7](https://www.python.org/download/releases/2.7/) should be installed on the node running DNS OA.

The following modules are already included but some of them require configuration. See the following sections for more information.
- [components/iana](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/proxy#IANA-iana)
- [components/data](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/proxy#data)
- [components/nc](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/proxy#Network-Context-nc)
- [components/reputation](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#Reputation)
- [components/iana](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#IANA-iana)
- [components/data](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#data)
- [components/nc](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#network-context-nc)
- [components/reputation](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components/reputation)
- dns_conf.json



**Prerequisites**

Before running DNS OA, users need to configure the components for the first time. Note that configuring these components also makes them work for other data sources, such as Flow and Proxy.

- Configure database engine
- Configure GTI services
- Configure IANA service
- Configure Network Context service
- Configure Geolocation
- Generate ML results for DNS


**Output**

- dns_scores.csv: Main results file for DNS OA. This file will contain suspicious connects information and it's limited to the number of rows the user selected when running [oa/start_oa.py](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa).
- dns_scores.csv: Main results file for DNS OA. This file will contain suspicious connects information and it's limited to the number of rows the user selected when running [oa/start_oa.py](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/INSTALL.md#usage).

Schema with zero-indexed columns:

10 changes: 5 additions & 5 deletions oa/dns/ipynb_templates/EdgeNotebook.md
@@ -1,9 +1,9 @@
#DNS Edge Investigation Notebook

###Dependencies
- iPython == 3.2.1 [check documentation](https://ipython.org/ipython-doc/3/index.html)
- Python 2.7.6
- ipywidgets
- [iPython == 3.2.1](https://ipython.org/ipython-doc/3/index.html)
- [Python 2.7.6](https://www.python.org/download/releases/2.7.6/)
- [ipywidgets 5.1.1](https://ipywidgets.readthedocs.io/en/latest/user_install.html#with-pip)

The following python modules will be imported for the notebook to work correctly:

@@ -20,8 +20,8 @@ The following python modules will be imported for the notebook to work correctly

###Pre-requisites
- Execution of the oni-oa process for DNS
- Correct setup the duxbay.conf file. You can check this [link](https://github.com/Open-Network-Insight/open-network-insight/wiki/Edit%20Solution%20Configuration)
- Have a public key authentication between the current UI node and the ML node. You can follow this [instructions](https://github.com/Open-Network-Insight/open-network-insight/wiki/Configure%20User%20Accounts#configure-user-accounts)
- Correct setup of the duxbay.conf file. [Read more](https://github.com/Open-Network-Insight/open-network-insight/wiki/Edit%20Solution%20Configuration)
- Have public key authentication set up between the current UI node and the ML node. [Read more](https://github.com/Open-Network-Insight/open-network-insight/wiki/Configure%20User%20Accounts#configure-user-accounts)


##Data source
8 changes: 4 additions & 4 deletions oa/dns/ipynb_templates/ThreatInvestigation.md
@@ -1,9 +1,9 @@
#DNS Threat Investigation Notebook

###Dependencies
- iPython == 3.2.1 [check documentation](https://ipython.org/ipython-doc/3/index.html)
- Python 2.7.6
- ipywidgets
- [iPython == 3.2.1](https://ipython.org/ipython-doc/3/index.html)
- [Python 2.7.6](https://www.python.org/download/releases/2.7.6/)
- [ipywidgets 5.1.1](https://ipywidgets.readthedocs.io/en/latest/user_install.html#with-pip)

The following python modules will have to be imported for the notebook to work correctly:

@@ -22,7 +22,7 @@ The following python modules will have to be imported for the notebook to work c
##Pre-requisites
- Execution of the oni-oa process for DNS
- Score a set of connections in the Edge Investigation Notebook
- Correct setup of the duxbay.conf file. You can check this [link](https://github.com/Open-Network-Insight/open-network-insight/wiki/Edit%20Solution%20Configuration)
- Correct setup of the duxbay.conf file. [Read more](https://github.com/Open-Network-Insight/open-network-insight/wiki/Edit%20Solution%20Configuration)


##Additional Configuration
35 changes: 17 additions & 18 deletions oa/flow/README.md
@@ -1,15 +1,11 @@
# **Flow OA**

oni-oa sub-module for Open Network Insight, version 1.1

Flow sub-module will extract and transform Flow data already ranked by oni-ml and will load into csv files for presentation layer.

The Flow sub-module extracts and transforms Flow data already ranked by oni-ml and loads it into csv files for the presentation layer.

## **Flow OA Components**

### flow_oa.py
Flow oni-oa main script

It executes the following steps:
Flow oni-oa main script executes the following steps:

1. Creates the required folder structure for output files if it does not exist. This is:

@@ -26,12 +22,13 @@ It executes the following steps:

**Dependencies**

- python 2.7. [Python 2.7](https://www.python.org/download/releases/2.7/) should be installed in the node running Flow OA.
- [Python 2.7](https://www.python.org/download/releases/2.7/) should be installed in the node running Flow OA.

The following files and modules are already included but some of them require configuration. See the following sections for more information:
- [components/nc](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa/components)
- [components/geoloc](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa/components)
- [components/reputation](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa/components)
- [components/iana](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#IANA-iana)
- [components/data](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#data)
- [components/nc](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components#network-context-nc)
- [components/reputation](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/components/reputation)
- flow_config.json

The following files are not included:
@@ -40,16 +37,18 @@ The following files are not included:

**Prerequisites**

Before running Flow OA users need to configure components for the first time. It is important to mention that configuring these components make them work for other data sources as DNS and Proxy.
- Configure reputation module components/reputation
- Configure network context module components/nc
- Configure geo localization module components/geo
- Create iploc.csv file context/iploc.csv
- Generate ML results for Flow
Before running Flow OA, users need to configure the components for the first time. Note that configuring these components also makes them work for other data sources, such as DNS and Proxy.

- Configure database engine
- Configure GTI services
- Configure IANA service
- Configure Network Context service
- Configure Geolocation service
- Generate ML results for Flow

**Output**

- flow_scores.csv. Main results file for Flow OA. This file will contain suspicious connects information and it's limited to the number of rows the user selected when running [oa/start_oa.py](https://github.com/Open-Network-Insight/oni-oa/tree/1.1/oa).
- flow_scores.csv. Main results file for Flow OA. This file will contain suspicious connects information and it's limited to the number of rows the user selected when running [oa/start_oa.py](https://github.com/Open-Network-Insight/oni-oa/blob/1.1/oa/INSTALL.md#usage).

Schema with zero-indexed columns:
0. sev: int