gphanikumar · ArivoliR · Jul 31, 2024
diff --git a/arivoli/README.md b/arivoli/README.md
@@ -0,0 +1,8 @@
+# ID2090 Introduction to Scientific Computing
+
+## Solutions to Assignments Jan-May 2024
+
+Assignments are solved using Python, shell scripts, SageMath, and Octave. Reports are written in LaTeX.
+- Solutions are mostly submitted as `.sh` files (bash scripts), but the code inside uses the languages mentioned above.
+- Each assignment folder has an `assignment.pdf` file that contains all the information about the questions.
+- From assignment 4 onwards, each solution includes a LaTeX report, `report.pdf`, with a detailed explanation of the solution.
diff --git a/arivoli/assignment_1/Assignment-1.pdf b/arivoli/assignment_1/Assignment-1.pdf
diff --git a/arivoli/assignment_1/README.md b/arivoli/assignment_1/README.md
@@ -0,0 +1,29 @@
+# Assignment 1
+
+## Question 1
+You are provided with a shell script binary, `hunt.sh.x`. Only execute permissions are given to the script.
+Upon running it, you’ll receive a unique `README` file and a unique `directory tree`. Navigate into the
+directory to find a `README` file. Follow the instructions given in your respective `README` file, follow the
+instructions at every stage and obtain the final key. Name your answer script as `question1.sh`, and
+attach the key in this shell script. Executing your script should only output the key you find. Feel free
+to use the man pages for commands as and when required. Happy hunting!
+
+## Question 2
+Recurrence relations are often encountered in modeling the dynamics of processes, analyzing algorithms, and generating sequences. 
+The Fibonacci sequence is the simplest and most famous recurrence relation.
+In this exercise, you are tasked with finding the n<sup>th</sup> term in a generalized recurrence relation given by:
+
+ `af[n] = bf[n − 1] + cf[n − 2] for a, b, c ∈ R and f : N → R`
+
+The coefficients a, b, c, and the first two values f[1] and f[2] will be passed (in order) as a file during input. 
+Your program is expected to take in t test cases and output the n<sub>t</sub>th term for every test case.
+Your program must check whether the correct input format is followed and throw an error if incorrect input is provided, also indicating the correct usage.
+
+## Question 3
+In this exercise, you are provided with a file that contains two parts. 
+The first 52 lines specify how each character from the alphabet is encoded in a number format [Aa-Zz], which is a function of the equivalent ASCII values of each of the alphabets. 
+The second part of the file (subsequent lines) contains encoded values for each username/roll number. 
+Your program must utilize the encodings to decode the encoded names for each username in the second part of the file and output the names. 
+Save the output of the decoded file onto `output.txt`. 
+
+Use `curl` on the [website](https://id2090assignment1.s3.ap-south-1.amazonaws.com/Q3.txt) to obtain the file.
diff --git a/arivoli/assignment_1/question_1/README.md b/arivoli/assignment_1/question_1/README.md
@@ -0,0 +1,25 @@
+# Question 1
+
+
+## Problem Statement
+You are provided with a shell script binary, `hunt.sh.x`. Only execute permissions are given to the script.
+Upon running it, you’ll receive a unique `README` file and a unique `directory tree`. Navigate into the
+directory to find a `README` file. Follow the instructions given in your respective `README` file, follow the
+instructions at every stage and obtain the final key. Name your answer script as `question1.sh`, and
+attach the key in this shell script. Executing your script should only output the key you find. Feel free
+to use the man pages for commands as and when required. Happy hunting!
+
+*This question cannot be attempted without access to the course VM.*
+
+### Note
+The `hunt.sh.x` file can be found in `/var/home/Jan24/assignments/assignment_1`.
+
+## Solution
+
+**Commands Used:** `cd, ls, grep, find, file, mv, strings, xz, bzip2, gzip`
+
+### Usage
+
+```bash
+./question_1.sh
+``` 
diff --git a/arivoli/assignment_1/question_1/writeup.md b/arivoli/assignment_1/question_1/writeup.md
@@ -0,0 +1,45 @@
+After running `hunt.sh.x`,
+in `assignment_1/question_1`:
+```
+ee23b008@ID2090:~/assignment_1/question_1$ cat README.md 
+# Welcome to Treasure Hunt.
+## First Challenge:
+## There is a Giratina inside.
+### If you dont know what Giratina is, it is a Legendary Ghost Pokemon.
+### Similarily, you will see a ghost file inside, which you can't see in your naked eyes.
+### But if you could access it, it would serve you forever.
+``` 
+
+Find this ghost file and open it:
+```
+ee23b008@ID2090:~/assignment_1/question_1$ find -name '.*'
+.
+./0d/.gh6st
+
+ee23b008@ID2090:~/assignment_1/question_1/0d$ cat .gh6st 
+You successfully caught Giratina with the Pokeball. Here I am to serve you.
+ Go to the directory number 0 and file number 5.
+Remember: The next instruction is prepended with many ampersands in the file.
+
+ee23b008@ID2090:~/assignment_1/question_1/0d/0d$ strings 5 | grep '&&&&'
+&&&&dIIR#8	fiL#2	Mult1P ly	c0m preSsed	f1Le	@ he@d:Tr Y_Ur_LuCk_&_Dec 0mpreSS_iT	k
+```
+
+You will now find a file that has been compressed around 10 times.
+Use the `file` command to identify the compression format of each layer, then rename the file with the corresponding suffix
+(for example, rename `5.txt` to `5.gz` so that `gzip` can decompress it).
+
+Final output:
+```
+00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+
+You have successfully completed the Treasure Hunt. Here is your reward.
+
+---- BEGIN PRIVATE KEY ----
+a9d4ef45656fceee7571acc6037a4841526f81e6f0dbff1c093cd21c33e3a9f8d6c72cbf93a33c23c0504fb639ca96474286f3bb3158b9112fa667ac8c717b46
+---- END PRIVATE KEY ----
+00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+```
+
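Note on `writeup.md`: the repeated decompression described above could be automated along the following lines. This is a minimal sketch, not the submitted solution. `peel_layers` is a hypothetical helper, it assumes only gzip, bzip2 and xz layers occur (the tools listed in the question's README), and it reads each layer's magic bytes directly instead of parsing the output of `file`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: strip compression layers from a file until none remain.
# Assumption: only gzip, bzip2 and xz layers are present.
peel_layers() {
	local f=$1 magic
	while :; do
		# First three bytes as hex, e.g. "1f8b08" for a gzip stream.
		magic=$(head -c 3 "$f" | od -An -tx1 | tr -d ' \n')
		case $magic in
		1f8b*) mv "$f" "$f.gz" && gzip -d "$f.gz" || return 1 ;;
		425a68) mv "$f" "$f.bz2" && bzip2 -d "$f.bz2" || return 1 ;;
		fd377a) mv "$f" "$f.xz" && xz -d "$f.xz" || return 1 ;;
		*) break ;; # no known compression signature left
		esac
	done
}
```

Calling `peel_layers 5` would leave the decompressed content in `5`. Plain data that happens to start with one of these signatures would be misdetected, so the result is worth checking by eye.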
diff --git a/arivoli/assignment_1/question_2/README.md b/arivoli/assignment_1/question_2/README.md
@@ -0,0 +1,32 @@
+# Question 2
+
+## Problem Statement: 
+
+Recurrence relations are often encountered in modeling the dynamics of processes, analyzing algorithms, and generating sequences. 
+The Fibonacci sequence is the simplest and most famous recurrence relation.
+In this exercise, you are tasked with finding the n<sup>th</sup> term in a generalized recurrence relation given by:
+
+ `af[n] = bf[n − 1] + cf[n − 2] for a, b, c ∈ R and f : N → R`
+
+The coefficients a, b, c, and the first two values f[1] and f[2] will be passed (in order) as a file during input. 
+Your program is expected to take in t test cases and output the n<sub>t</sub>th term for every test case.
+Your program must check whether the correct input format is followed and throw an error if incorrect input is provided, also indicating the correct usage.
+
+## Usage: 
+```
+./question_2.sh initial.txt testcases.txt
+```
+
+## Input: 
+`initial.txt`\
+1, 1, 1, 1, 1\
+`testcases.txt`\
+3\
+14\
+27\
+18
+
+## Output:
+377\
+196418\
+2584
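Note on the sample above: with a, b, c, f[1] and f[2] all equal to 1, the relation reduces to the Fibonacci sequence, so the listed outputs can be checked with integer arithmetic alone. The sketch below is illustrative only. `nth_term` is a hypothetical name, and it assumes integer coefficients; real-valued coefficients would need `bc` or `awk` instead of shell arithmetic.

```bash
#!/usr/bin/env bash
# Integer-only sketch of f[n] = (b*f[n-1] + c*f[n-2]) / a.
# Arguments: n a b c f1 f2 (all assumed to be integers, n >= 1).
nth_term() {
	local n=$1 a=$2 b=$3 c=$4 prev=$5 curr=$6 next i
	if ((n == 1)); then
		echo "$prev"
		return
	fi
	for ((i = 3; i <= n; i++)); do
		next=$(((b * curr + c * prev) / a))
		prev=$curr
		curr=$next
	done
	echo "$curr"
}

for n in 14 27 18; do
	nth_term "$n" 1 1 1 1 1
done
```

The loop prints 377, 196418 and 2584, matching the sample output.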
diff --git a/arivoli/assignment_1/question_2/initial.txt b/arivoli/assignment_1/question_2/initial.txt
@@ -0,0 +1 @@
+0.4231, -1.6553, -3.5874, 3.4877, 6.5455
diff --git a/arivoli/assignment_1/question_2/question_2.sh b/arivoli/assignment_1/question_2/question_2.sh
@@ -0,0 +1,77 @@
+#!/usr/bin/bash
+<<COMMENT
+Name: Arivoli Ramamoorthy
+Roll number: EE23B008
+Date: 02-03-2024
+Description: 
+-Find the n-th term in a generalized recurrence relation.
+-Recurrence Relation: af[n] = bf[n - 1] + cf[n - 2].  
+Input: Coefficients a, b, c, and initial values f[1] and f[2] are provided via a file.  
+Output: Computes and outputs the n-th term for each test case. 
+COMMENT
+
+#Check if there are 2 command line arguments
+if [ "$#" -ne 2 ]; then
+	echo "Usage: $0 <file1> <file2>"
+	exit 1
+fi
+
+# Function to calculate the nth term in the recurrence relation
+calculate_nth_term() {
+	local n=$1
+	local a=$2
+	local b=$3
+	local c=$4
+	local f1=$5
+	local f2=$6
+	# Base cases
+	if [ "$n" -eq 1 ]; then
+		echo "$f1"
+		return
+	elif [ "$n" -eq 2 ]; then
+		echo "$f2"
+		return
+	fi
+
+	local prev=($f1 $f2)
+	local current=0
+
+	for ((i = 3; i <= n; i++)); do
+		current=$(bc <<<"scale=4; ($b * ${prev[1]} + $c * ${prev[0]}) / $a")
+		prev[0]=${prev[1]}
+		prev[1]=$current
+	done
+
+	echo "$current"
+}
+
+# Reading coefficients from initial.txt
+read -r a b c f1 f2 <<<"$(cat "$1" | tr -d ',')"
+#echo "Coefficients: a=$a, b=$b, c=$c, f1=$f1, f2=$f2"
+if [[ -z $a || -z $b || -z $c || -z $f1 || -z $f2 ]]; then
+	echo "Error: All coefficients (a, b, c, f1, f2) must be provided in $1."
+	exit 1
+fi
+
+IFS= read -r num_testcases <"$2"
+#echo "Number of test cases: $num_testcases"
+
+display_correct_usage() {
+	echo "Correct usage: <number of test cases> <test case 1> <test case 2> ..."
+}
+
+actual_num_testcases=$(tail -n +2 "$2" | grep -c '')
+
+if [[ $num_testcases -ne $actual_num_testcases ]]; then
+	echo "Error: Number of test cases provided ($actual_num_testcases) does not match the specified number ($num_testcases)."
+	display_correct_usage
+	exit 1
+fi
+
+for ((i = 0; i < actual_num_testcases; i++)); do
+	IFS= read -r n
+	result=$(calculate_nth_term "$n" "$a" "$b" "$c" "$f1" "$f2")
+	#echo "For n=$n, nth term: $result"
+	echo "$result"
+done < <(tail -n +2 "$2")
+
diff --git a/arivoli/assignment_1/question_2/testcases.txt b/arivoli/assignment_1/question_2/testcases.txt
@@ -0,0 +1,6 @@
+5
+4
+17
+10
+15
+8
diff --git a/arivoli/assignment_1/question_3/README.md b/arivoli/assignment_1/question_3/README.md
@@ -0,0 +1,32 @@
+# Question 3
+
+## Problem Statement: 
+In this exercise, you are provided with a file that contains two parts. The first 52 lines specify how each
+character from the alphabet is encoded in a number format [Aa-Zz], which is a function of the equivalent
+ASCII values of each of the alphabets. The second part of the file (subsequent lines) contains encoded
+values for each username/roll number. Your program must utilize the encodings to decode the encoded
+names for each username in the second part of the file and output the names. Save the output of the
+decoded file onto `output.txt`\
+Use `curl` on the [website](https://id2090assignment1.s3.ap-south-1.amazonaws.com/Q3.txt) to obtain the file.
+
+## Usage: 
+```
+./question_3.sh <the above URL>
+```
+
+## Output:
+```
+Username,Password
+ae23b005,bhavesh
+ae23b010,guhaan
+..
+<and so on>
+```
+
+### Brownie Points: 
+For not hardcoding all the encodings and identifying the underlying function used
+on the ASCII values of each character. Also, mention the approach you took to guess the function.
+
+### Hint:
+Try looking up on `gnuplot` and plot the encodings for subsequent ASCII values (alphabets) and
+guess the function by its shape.
diff --git a/arivoli/assignment_1/question_3/question_3.sh b/arivoli/assignment_1/question_3/question_3.sh
@@ -0,0 +1,46 @@
+#!/usr/bin/bash
+<<COMMENT
+Name: Arivoli Ramamoorthy
+Roll number: EE23B008
+Date: 02-03-2024
+Description: 
+-The program decodes encoded names using provided encoding rules.
+Input: Encoded values for usernames/roll numbers.
+Output: Roll number, Decoded names saved in "output.txt".
+COMMENT
+
+
+# Dividing each encoded number by the ASCII value of its letter revealed a pattern:
+# A = 65*202; B = 66*204; C = 67*206; ...
+# For each increment of 1 in the ASCII value, the multiplying factor increases by 2,
+# so the formula is x*(2x + c), and substitution gives c = 72.
+# For example, A = 65 * (2*65 + 72), where 65 is the ASCII value of A.
+# Not plotted with gnuplot, but the curve should be a parabola since the equation is quadratic.
+
+# Fetch the file using curl
+curl -s "$1" >temp_file.txt
+
+decode_username() {
+	# Split the line into the roll number (first field) and the encoded values
+	encode="$*"
+	roll_num=$(awk '{print $1}' <<<"$encode")
+	encoded=${encode#* }
+	decoded=""
+	for x in $encoded; do
+		# Solve 2a^2 + 72a - x = 0 for the ASCII value a, keeping the non-negative root
+		ascii_value=$(echo "if (d=72*72-4*2*(-$x)) { x1=(-72+sqrt(d))/(2*2); x2=(-72-sqrt(d))/(2*2); if (x1>=0) x1; if (x2>=0) x2; }" | bc)
+		# Converting the ascii value to character
+		decoded+="$(printf \\$(printf '%03o' $ascii_value))"
+	done
+	echo "$roll_num","$decoded"
+}
+
+# Decode usernames from the remaining lines
+tail -n +53 temp_file.txt | while read -r line; do
+	username=$(decode_username "$line")
+	echo "$username"
+done >output.txt
+
+# Clean up temporary file
+rm temp_file.txt
+
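Note on `question_3.sh`: the encoding found in the script's comments, x*(2x + 72) for the ASCII value x, can be checked without `bc`. The sketch below is a hedged alternative and not the submitted method: instead of the quadratic formula it searches the ASCII range 65 to 122 for the value that reproduces the encoded number. `encode_char` and `decode_value` are hypothetical names.

```bash
#!/usr/bin/env bash
# Encode one character as x*(2x + 72), where x is its ASCII value.
encode_char() {
	local x
	printf -v x '%d' "'$1" # "'A" makes printf yield the character code 65
	echo $((x * (2 * x + 72)))
}

# Decode by brute force over the ASCII range A..z (65..122).
decode_value() {
	local y=$1 x
	for ((x = 65; x <= 122; x++)); do
		if ((x * (2 * x + 72) == y)); then
			printf "\\$(printf '%03o' "$x")\n" # octal escape -> character
			return 0
		fi
	done
	return 1 # value does not correspond to any code in the range
}

encode_char A      # 65 * 202
decode_value 13130 # back to the letter
```

The search is slower than solving the quadratic, but it avoids integer square roots and fails cleanly on values that no letter encodes to.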
diff --git a/arivoli/assignment_2/Assignment-2.pdf b/arivoli/assignment_2/Assignment-2.pdf
diff --git a/arivoli/assignment_2/README.md b/arivoli/assignment_2/README.md
@@ -0,0 +1,58 @@
+# Assignment 2
+
+## Question 1 
+
+Web scraping is the process of extracting data from a website or any online source. In this era of Large
+Language Models, web scraping has become commonplace for gathering large quantities of data. Often,
+the data that is gathered is unusable and requires pre-processing. In this task, you are required to fetch
+data from an online source and perform some basic manipulations to prepare the data.
+
+NASA maintains an [archive](https://apod.nasa.gov/apod/archivepixFull.html) of photographs captured by various enthusiasts, along with a brief explanation written by a professional astronomer. You are tasked to create a list of titles of these images that
+were uploaded on special dates (DD/MM/YYYY) such as<br />
+(a) dates whose YYYY is divisible by DD,<br />
+(b) dates whose YYYY is divisible by MM.
+
+
+## Question 2 
+
+Publicly available datasets are often riddled with errors. In most cases, data visualization reveals such
+inconsistencies. In this task, you are provided with a dataset of an EV manufacturer that contains
+multiple parameters. The parameters are mentioned in the header of the dataset.<br />
+– Upon inspection, it turns out that all the letters in the data have been mistakenly replaced with
+their complement (where the complement of the ith letter of the alphabet is the (27 − i)th letter, with the
+case retained).<br />
+– Also, on close observation, the SoH and SoC columns are interchanged for `Vehicle Number` AG.<br />
+– (Misreported entries) In addition to these errors, there are also entries where the reported
+`mileage` is non-zero despite SoC = 0.<br />
+– There are also rows in the dataset where certain parameters are missing. Since those rows are
+useless, you may remove them.<br />
+You are tasked with correcting these errors to produce a clean dataset and also flagging misreported entries
+as “fake”. The dataset is located at `/var/home/Jan24/assignments/assignment_2`.
+
+
+## Question 3
+
+You are given a .csv in which each row is considered as a document (d) and the rows constitute the
+collection of documents (D). Assume that periods (‘.’) and commas (‘,’) are the only punctuation marks
+present in the documents. <br />
+(a) Given a term t, return its TF-IDF index (accurate to 4 decimal places). <br />
+(b) If no arguments are passed when calling question_3.sh, return the top-5 terms (with values) in
+decreasing order of TF-IDF index.
+
+## Question 4
+
+Structured Query Language (SQL) is extensively used to manage databases and is designed to query
+data in relational databases. In this exercise, you are tasked to replicate one of SQL’s fundamental
+features, JOIN, using (preferably) awk or a combination of join, sort and sed (and other commands as
+needed).<br />
+The JOIN clause is used to combine rows from two (or more) tables based on some relation common
+between them. SQL offers four types of JOINs (Fig. 1), namely<br />
+– INNER JOIN: Returns records that have matching values in both tables,<br />
+– LEFT JOIN: Returns all records from the left table, and the matched records from the right table,<br />
+– RIGHT JOIN: Returns all records from the right table, and the matched records from the left table,<br />
+– FULL (OUTER) JOIN: Returns all records when there is a match in either the left or the right table.<br />
+
+(a) Write a bash script with flags (‘-I’ for INNER JOIN, ‘-L’ for LEFT JOIN, ‘-R’ for RIGHT JOIN and
+‘-F’ for FULL JOIN) to parse two .csv files (with fixed columns) and output the joined .csv file <br />
+(b) Extend sub-part (a) to adapt for generic csv files (no restriction on number of columns). You may
+assume that the column names across the two files will be identical.
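Note on Question 4: the INNER and LEFT joins described above can be sketched in `awk` as follows. This is an illustration under stated assumptions, not the assignment's reference solution. `csv_join` is a hypothetical helper, the join key is assumed to be the first column of both files, fields are assumed not to contain quoted commas, and if a key repeats in the right table only its last row is kept.

```bash
#!/usr/bin/env bash
# Hypothetical helper: join two CSV files on their first column.
# usage: csv_join <I|L> left.csv right.csv   (I = inner join, L = left join)
csv_join() {
	awk -F, -v OFS=, -v mode="$1" '
		NR == FNR {              # first file given to awk: the right table
			key = $1
			n = NF - 1           # number of non-key columns on the right
			sub(/^[^,]*,/, "")   # drop the key, keep the remaining columns
			right[key] = $0
			next
		}
		($1 in right) { print $0, right[$1]; next }
		mode == "L" {            # unmatched left row: pad with empty fields
			pad = ""
			for (i = 1; i < n; i++) pad = pad ","
			print $0, pad
		}
	' "$3" "$2"
}
```

A RIGHT join follows by swapping the two file arguments and reordering the output columns, and a FULL join additionally prints the right rows whose keys were never matched.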
diff --git a/arivoli/assignment_2/question_1/README.md b/arivoli/assignment_2/question_1/README.md
@@ -0,0 +1,17 @@
+# Question 1 
+
+Web scraping is the process of extracting data from a website or any online source. In this era of Large
+Language Models, web scraping has become commonplace for gathering large quantities of data. Often,
+the data that is gathered is unusable and requires pre-processing. In this task, you are required to fetch
+data from an online source and perform some basic manipulations to prepare the data.
+
+NASA maintains an [archive](https://apod.nasa.gov/apod/archivepixFull.html) of photographs captured by various enthusiasts, along with a brief explanation written by a professional astronomer. You are tasked to create a list of titles of these images that
+were uploaded on special dates (DD/MM/YYYY) such as<br />
+(a) dates whose YYYY is divisible by DD,<br />
+(b) dates whose YYYY is divisible by MM.
+
+## Usage:
+`./question_1.sh`
+
+## Output:
+Two `.csv` files, `answer_1a.csv` and `answer_1b.csv`, one for each part.
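Note on the date conditions: the two divisibility tests can be expressed with shell arithmetic. The sketch below assumes dates have already been extracted in DD/MM/YYYY form; how the archive page is scraped is not covered here. `is_special_date` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Succeeds (exit status 0) if YYYY is divisible by DD (mode a) or by MM (mode b).
# usage: is_special_date <a|b> DD/MM/YYYY
is_special_date() {
	local mode=$1 dd mm yyyy divisor
	IFS=/ read -r dd mm yyyy <<<"$2"
	if [[ $mode == a ]]; then
		divisor=$dd
	else
		divisor=$mm
	fi
	# 10# forces base 10, so "08" and "09" are not rejected as invalid octal.
	((10#$yyyy % 10#$divisor == 0))
}

if is_special_date a 08/03/2024; then
	echo "08/03/2024 satisfies (a)"
fi
```

The `10#` prefix matters here because zero-padded days and months such as `08` would otherwise be read as octal numbers by bash arithmetic.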