This repository was archived by the owner on May 15, 2019. It is now read-only.

Install and Configure ML

NathanSegerlind edited this page Sep 8, 2016 · 14 revisions

On the edge server, recursively clone the oni-ml repository.

[soluser@edge-server]$ git clone --recursive https://github.com/Open-Network-Insight/oni-ml.git
[soluser@edge-server]$ cd oni-ml/oni-lda-c/
[soluser@edge-server]$ make clean

Edit the machinefile. This file tells the MPI engine how many workers to create on each host.

[soluser@edge-server]$ vim /oni-lda-c/machinefile

Modify the machinefile so that it lists exactly the same nodes that you used for the NODES environment variable, each followed by its number of workers. Every line has the format host:workers (note the ':' separator):

[soluser@edge-server]$ cat /oni-lda-c/machinefile 
Host1:5
Host2:5
Host3:5
Host4:5 
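
One way to keep the machinefile in step with NODES is to generate it rather than type it. The sketch below is illustrative only and is not part of oni-ml: the function name make_machinefile is hypothetical, and it assumes the host names are passed as plain arguments and that every host gets the same number of workers.

```shell
# Hypothetical helper, not part of oni-ml.
# Prints one "host:workers" line per host.
# Usage: make_machinefile WORKERS_PER_HOST HOST [HOST...]
# The body runs in a subshell, so its variables do not leak into the caller.
make_machinefile() (
    workers="$1"
    shift
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$workers"
    done
)

# Example: reproduce the four-host file shown above.
# make_machinefile 5 Host1 Host2 Host3 Host4 > machinefile
```

Redirecting the output to the machinefile, as in the commented example, overwrites the existing file, so review the result with cat before building.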

Edit MPI variables in /etc/duxbay.conf

In the configuration file /etc/duxbay.conf, set the following variables:

MPI_CMD - the command line used to invoke MPI execution, e.g. mpiexec
PROCESS_COUNT - the total number of processes across all MPI nodes
MPI_PREP_CMD - an optional command that sources variables or otherwise prepares the environment before MPI is invoked; may be empty
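
As an illustration only, the fragment below shows how these three variables might look for the four-host machinefile above. It assumes that /etc/duxbay.conf uses shell-style NAME=value assignments and that PROCESS_COUNT should equal the sum of the worker counts in the machinefile; neither assumption is confirmed by this page. The values and the helper count_workers are hypothetical.

```shell
# Illustrative values only; adjust them to your MPI installation.
MPI_CMD="mpiexec"     # command used to launch MPI
PROCESS_COUNT=20      # 4 hosts x 5 workers in the example machinefile
MPI_PREP_CMD=""       # optional environment preparation; may stay empty

# Hypothetical helper: sum the worker counts of the "host:N" lines
# in the machinefile given as the first argument.
count_workers() {
    awk -F: 'NF >= 2 { total += $2 } END { print total + 0 }' "$1"
}

# Example: compare the configured value with the machinefile.
# count_workers machinefile
```

If the number printed by count_workers differs from PROCESS_COUNT, one of the two files is probably out of date.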

The build step

Build the oni-ml jar for Spark.

[soluser@edge-server]$ cd oni-ml
[soluser@edge-server]$ sbt assembly

Build the oni-lda-c code.

[soluser@edge-server]$ cd oni-lda-c/
[soluser@edge-server]$ make

Copy the entire oni-ml folder to the primary ML node, the node that will launch the mpiexec command and do the local processing. For simplicity, this is often the lowest-numbered node in this role. Log in to the primary ML node.

[soluser@edge-server]$ scp -r ml node-04:/home/"soluser"/.
[soluser@edge-server]$ ssh node-04
[soluser@node-04]$ cd /home/"soluser"/ml

The completed and configured ML pipeline must now be copied to all the nodes. The script install_ml.sh does this with the help of the NODES variable.

[soluser@node-04]$ ./install_ml.sh 
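
For orientation, the loop below sketches the general idea of distributing a folder to a list of hosts. It is not the contents of install_ml.sh: the function name copy_to_nodes, its arguments, and the choice to pass hosts as plain arguments are all hypothetical. It only prints the scp commands it would run, so it is safe to try before the real installation.

```shell
# Hypothetical dry-run sketch, NOT the real install_ml.sh.
# Prints one scp command per host instead of executing it.
# Usage: copy_to_nodes SRC_DIR DEST_DIR HOST [HOST...]
# The body runs in a subshell, so its variables do not leak into the caller.
copy_to_nodes() (
    src="$1"
    dest="$2"
    shift 2
    for host in "$@"; do
        echo "scp -r $src $host:$dest"
    done
)

# Example with made-up host names:
# copy_to_nodes ml /home/soluser/. node-05 node-06
```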