TDM 20100: Project 2 - Introduction to Bash
Project Objectives
This project introduces you to some of the most useful UNIX tools, helps you navigate the filesystem, and enables you to run UNIX commands directly from within your Jupyter notebook.
Ways to run bash
There are three different ways to run bash code in Anvil.
1. In a terminal
2. Using !
3. Magic cell
Terminal
Go to File > New > Terminal
This should pop out a terminal in a new window.
The terminal approach allows man lookups (the man command won’t work with the other two approaches).
Try to run this code in the terminal:
# man is short for manual, to quit, press "q"
# use "k" or the up arrow to scroll up, or "j" or the down arrow to scroll down.
man grep
Using '!'
The ! method lets you run a bash command from within a cell that also contains Python code.
For example, in your Jupyter notebook, you can run this code in your cell:
!ls
import pandas as pd
myDF = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
myDF.head()
!echo "Hello World!"
Magic cell
Code cells that start with % or %% are sometimes referred to as magic cells. Any cell that begins with %%bash will run the bash code in that cell.
A bash cell does not know what code another cell has run. For example, if you create a variable in one %%bash cell and then write more bash code in a new cell, the new cell will not recognize the variable from the previous cell.
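One way to see this isolation outside of a notebook is to mimic two cells with two separate bash processes. The sketch below assumes that each %%bash cell runs in its own bash process, which is how the magic behaves in a standard Jupyter setup. The variable name tdm_demo_var is purely illustrative.

```shell
# Mimic two independent %%bash cells by launching two separate bash processes.
# tdm_demo_var is a hypothetical name used only for this illustration.

# "Cell 1": create a variable and print it.
first=$(bash -c 'tdm_demo_var="hello"; echo "$tdm_demo_var"')

# "Cell 2": the variable from "cell 1" does not exist here,
# so the fallback text after ":-" is printed instead.
second=$(bash -c 'echo "${tdm_demo_var:-not set}"')

echo "cell 1 sees: $first"
echo "cell 2 sees: $second"
```

If you need a value in more than one bash cell, the simplest option is to define it again in each cell that uses it.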
To see a list of available magics, run %lsmagic. The commands listed in the "line" section are run with a single %.
|
To answer the project questions, please run your bash code using the |
Before moving on to the questions, you may also want to take a look at this video, which shows different ways of running bash commands on Anvil:
|
Dr. Ward has some more example with |
Questions
Question 1 (2 points)
The / is the root directory in a UNIX-like system. You can think of it as the top-level folder that contains all other folders on the system. For example, /home is a folder located within the root directory.
The $HOME variable refers to the absolute path of your personal home directory. Inside the root / directory, there is a folder called home, which contains a subfolder named x-<your-username>. This x-<your-username> folder is your personal home directory.
Let’s explore more by doing some exercises below.
1. Write a bash command to display both your home directory ($HOME) and your current working directory (pwd). These two directories should be the same. Ensure you run the command in the terminal immediately after opening it, without making any changes to the home directory sidebar.
2. Write a bash command to change your current directory to /anvil/projects/tdm/data using the cd command.
3. Run the same command from Step 1 above again.
4. Explain any observations you see in the results from Step 1 and Step 3. Explain the difference between $HOME and pwd.
Relevant topics: home, pwd, cd, echo
1a. Code used to answer Step 1, 2, 3
1b. Output from Step 1, 3
1c. Written answer for Step 4
Question 2 (2 points)
Relative paths are an important concept to understand, especially when you try to navigate files and folders in a UNIX-like operating system.
. represents the current directory - you can think of it as "here."
- cd . means to stay in the current directory
- ./myscript.sh means to run the myscript.sh file in the current directory
- mv ./myfile.txt $HOME means to move the myfile.txt file from the current directory to your home directory
.. represents the parent directory, relative to the rest of the path.
- cd .. means to move up one directory
- mv ../myscript.sh ./ means to move the myscript.sh file from the parent directory to the current directory
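The behavior of . and .. described above can be tried safely in a throwaway directory. The following is a minimal sketch, assuming mktemp is available. The directory names parent and child are hypothetical and exist only for this illustration.

```shell
# Build a throwaway directory tree; all names here are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/parent/child"
touch "$demo/parent/myscript.sh"

cd "$demo/parent/child"
before=$(pwd)

cd .                      # "." is the current directory, so nothing changes
after_dot=$(pwd)

mv ../myscript.sh ./      # move the file from the parent into the current directory
moved=$(ls)

cd ..                     # ".." is the parent directory, so this moves up one level
after_dotdot=$(basename "$(pwd)")

echo "before 'cd .': $before"
echo "after 'cd .':  $after_dot"
echo "file now in child: $moved"
echo "directory after 'cd ..': $after_dotdot"

cd /                      # step out of the tree before cleaning up
rm -r "$demo"
```

Because every path in the sketch is relative to wherever you currently are, the same commands work no matter where the temporary directory is created.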
Let’s explore more by doing some exercises below.
1. Write a bash command to change your current directory to /anvil/projects/tdm/data/zillow using the cd command.
2. Run each of the commands below individually and print the current working directory for parts a through d. After executing each command, make sure to return to the /anvil/projects/tdm/data/zillow directory. Explain the functionality of each command based on your observations.
   a. cd
   b. cd .
   c. cd ..
   d. cd ../../
   e. ls or ls .
   f. ls -la or ls -la .
   g. ls ../
Relevant topics: pwd, cd, ., .., ls, echo
2a. Code used to answer Step 1, 2
2b. Final current working directory for a, b, c, d
2c. Output of e, f, g
2d. Written explanation of what each command does
2e. How does using relative paths benefit you for particular commands like ls? Hint: check your current working directory for g.
Question 3 (2 points)
There are quick ways to get some information about a file without first reading it in, as you would in R or Python.
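As a small illustration, the sketch below inspects a tiny made-up CSV file with head, tail, and wc. The file and its contents are invented for this example and are not part of the project data. The -n option sets how many lines head and tail print.

```shell
# Create a tiny sample CSV; the file and its contents are made up for illustration.
sample=$(mktemp)
printf 'name,score\nalice,90\nbob,85\ncarol,77\n' > "$sample"

first_two=$(head -n 2 "$sample")   # first two lines of the file
last_two=$(tail -n 2 "$sample")    # last two lines of the file
line_count=$(wc -l < "$sample")    # line count; reading from stdin keeps the filename out of the output

echo "$first_two"
echo "$last_two"
echo "lines: $line_count"

rm "$sample"
```

None of these commands loads the whole file into memory the way a data frame would, which is why they stay fast even on very large files.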
Quick Tip: Tab completion is a very handy trick. When you partially type a directory name in the terminal, you can press the tab key to see all available options, or to autocomplete the name if there is only one match. Give it a try!
cd /anvil/p # then hit the tab key then enter
1. Go to /anvil/projects/tdm/data/icecream/breyers
2. Print the first five rows of reviews.csv using head
3. Print the last five rows of reviews.csv using tail
4. Print only column names (first row) of reviews.csv using the -n option
5. Run wc reviews.csv and identify which parts of the output represent what information
6. Get the line count only for the given file using the -l option
Relevant topics: head, tail, wc
3a. The code used to solve all the steps above
3b. The output from Steps 2, 3, 4, 5, and 6
3c. A written explanation for Step 5 (describing the parts of the wc output)
|
For more practise, please refer to Dr. Ward’s following video which includes examples with |
Question 4 (2 points)
The following directories have been discussed so far:
- $HOME or /home/$USER: your home directory
- /anvil/projects/tdm/: TDM directory
- /anvil/projects/tdm/data: where public data lives in TDM directory
There’s one more directory you should know about: $SCRATCH or /anvil/scratch/$USER
Run the command below to see your quota and usage (the myquota command works only from the terminal):
myquota
1. What are the size limits for your home directory and scratch directory?
2. Copy the reviews.csv file to your SCRATCH directory using cp
3. Copy the entire icecream directory to your SCRATCH
4. Print the list of files and folders in your SCRATCH directory
5. Delete the copied reviews.csv from your SCRATCH
6. Delete the copied icecream directory from your SCRATCH
7. Print the list of files and folders of your SCRATCH directory again
Relevant topics: cp, rm, rmdir
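To get a feel for how these commands differ, here is a minimal sketch that copies and deletes files inside a temporary directory. The names src, dest, and flavors are hypothetical stand-ins and are not the project paths. Note that rmdir only removes empty directories, while rm -r removes a directory together with its contents.

```shell
# Work inside a throwaway directory; every name below is hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/flavors" "$work/dest"
echo "id,stars" > "$work/src/reviews.csv"
echo "vanilla" > "$work/src/flavors/notes.txt"

cp "$work/src/reviews.csv" "$work/dest/"     # copy a single file
cp -r "$work/src/flavors" "$work/dest/"      # -r copies a directory and everything in it
after_copy=$(ls "$work/dest")

rm "$work/dest/reviews.csv"                  # delete a single file

# rmdir refuses to remove a directory that still contains files.
rmdir_refused=no
rmdir "$work/dest/flavors" 2>/dev/null || rmdir_refused=yes

rm -r "$work/dest/flavors"                   # -r deletes the directory and its contents
after_delete=$(ls "$work/dest")

echo "after copy:"
echo "$after_copy"
echo "rmdir refused non-empty directory: $rmdir_refused"
echo "after delete: '$after_delete'"

rm -r "$work"
```

Deletions made with rm are permanent, so it is worth double checking the path before running rm -r.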
|
Dr. Ward shows moving some large files in the following video. You can compare your SCRATCH directory space (myquota-this command works only from terminal) with what Dr. Ward says in the video. Is it the same? Also, there are additional examples with |
4a. Written answer for Step 1 (size limits for home and scratch directories)
4b. Code used to solve Steps 1 through 7
4c. Output from Steps 1, 4, 7
Question 5 (2 points)
1. Create a new directory called mydinner in your home directory
2. Inside the mydinner directory, create the following files using the touch command:
   - spaghetti.txt
   - bread.txt
   - broccoli.txt
   - smoothie.txt
   - tiramisu.txt
   - Optional: Feel free to create additional files for other dinner items you enjoy
3. Display the contents of the mydinner directory using ls
4. Edit each of the files to include the following ingredients:
   - spaghetti.txt: noodle, tomato sauce
   - bread.txt: bread, garlic, butter, cheese
   - broccoli.txt: broccoli, salt, pepper
   - smoothie.txt: strawberry, banana, milk
   - tiramisu.txt: top-secret tiramisu recipe from granny
   - Optional: Add ingredients to any additional files you created
5. Use the cat command to print the contents of each file
6. Move the mydinner directory to SCRATCH and rename it to mybreakfast
7. Display the contents of the SCRATCH directory
8. Delete the mybreakfast directory
Relevant topics: mkdir, touch, cat, vi, echo, >>
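As a hedged illustration of how these tools fit together, the sketch below creates a directory and a file, appends two lines with >>, prints the file with cat, and then moves and renames the directory with a single mv. It runs in a temporary directory, and the names mylunch, mysnack, and salad.txt are invented for this example.

```shell
# Work inside a throwaway directory; every name below is hypothetical.
base=$(mktemp -d)
mkdir "$base/mylunch"                 # create a directory
touch "$base/mylunch/salad.txt"       # create an empty file

# ">>" appends to the file; a single ">" would overwrite it instead.
echo "lettuce" >> "$base/mylunch/salad.txt"
echo "tomato" >> "$base/mylunch/salad.txt"
contents=$(cat "$base/mylunch/salad.txt")

# Giving mv a new final name moves and renames in one step.
mv "$base/mylunch" "$base/mysnack"
listing=$(ls "$base")

echo "$contents"
echo "$listing"

rm -r "$base"
```

Using echo with >> is a quick alternative to opening an editor such as vi when you only need to add a line or two.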
5a. Code used to solve all the steps above
5b. Output from Step 3, 5, 7
Submitting your Work
Once you have completed the questions, save your Jupyter notebook. You can then download the notebook and submit it to Gradescope.
- firstname_lastname_project1.ipynb
|
You must double check your You will not receive full credit if your |