- Soonha Hwang (soonhah2)
- Shu Xian Lai (sxlai2)
- Tyler Kim (tkim139)
- Akshay Sivaraman (abs7)
Whenever we surf in Amazon, Amazon always tempts us to purchase related products. For example, when I try to buy a shampoo, Amazon would show soaps and shower towels. When I try to buy a calculator, Amazon would suggest pens and pencils. The following image demonstrates Amazon's suggestions system when I searched for a monitor.
We were curious about three things, which we tried to find out by project:- What are the top-N items that render the most number of related purchases (IDDFS or Depth Limited Search)?
- Among the top-N products from step 1, what are the products that would be most profitable to promote on (Tarjan's Algorithm)?
- After filtering top-N products from the raw dataset, what are the total number of products we are left with? (Breadth First Search)
You can follow few easy steps to execute our project.
To begin with, you will have to load llvm on EWS environment with module load llvm
- Clone this repository
git clone https://github-dev.cs.illinois.edu/cs225-sp21/soonhah2-sxlai2-tkim139-abs7.git
- Locate in correct directory. For more information on our directory structure, please refer to Location of Code, Data, and Results
cd soonhah2-sxlai2-tkim139-abs7/Forest
Inside the Forest directory, run make to compile the project. Execute main to run the project.
make
./mainFollowing output will show on terminal. Input your own variables and see different results according to the different inputs!
Note that all output is stored into a txt file in the results directory and our code assumes all user inputs are strictly legal according to this README and the prompts.
For example,
Enter limit for IDDFS (less than or equal to 3): 2
From the following input files
----------------------
1- SNAP Amazon Dataset
2- sample testfile 1
3- sample testfile 2
4- sample testfile 3
5- sample testfile 4
----------------------
Choose input file: 1
Enter ranking N: 100
Do you want to execute Tarjan's algorithm? (yes or no): yes
Do you want to execute BFS? (yes or no): no
Doing so will output results for IDDFS and Tarjan's Algorithm but not BFS. Also, in the example, we have selected ../Amazon6061.txt for our input, but you can input any input data just by simply typing in the file path after Enter input file: command.
you can also run testcases with the following steps:
- run test.cpp
make test ./test - Test can be run by
./test, or can be run with./test "test names". test.cpp is located inForest/tests/./test "Checking IDDFS testcase 3" - Note for test suite: In our test suite, we created specific test cases for: Constructor (both Node and Forest class), File I/O, IDDFS, Tarjan's Algorithm (we call them SCC in our test cases description), BFS, and also some miscellaneous test cases that test some of the nested helper functions inside our main algorithms. For each algorithm & traversal, we created at least two different test cases to account for graphs of different nature. What each test case does should be fairly straightforward from the title of our test cases.
All our code, test suite and results are in the Forest directory. Our code is in the files "Forest.h", "Forest.cpp" and "main.cpp". Our test suite is in the Tests directory. Our results will be outputted into the "result_midpoint.txt" file (in the results directory). Our dataset is in the general directory - "Amazon0601.txt", while we have self made test cases from sample_testcase1.txt to sample_testcase4.txt in directory called /tests. In the /tests directory, you can also see the visualization of the sample input.
├── Forest
│ ├── Forest
│ ├── Forest.cpp
│ ├── Forest.h
│ ├── main.cpp
│ ├── results
│ │ └── result_midpoint.txt (stores result)
│ └── tests
│ ├── testcases.txt (4 self created test cases)
│ └── test.cpp
├── Amazon0601.txt (Main testcase)
├── README.md
├── Final Project Proposal.md
└── Team_Contract.md
Click to see our extra documents:
Special thanks to Rishabh for being our good TA!