Grep is a Unix/Linux command for searching for a pattern in a file, a group of files, or an input stream. It is a very powerful tool that uses regular expressions to find the desired pattern. It comes with a lot of options; we will only see a few.
Here, we will only discuss the basics of grep and see how to use it to search for particular strings in a file.
If you’re reading this, I assume you have a running Unix/Linux system on which to try the commands as you follow along. If you can’t get your hands on one, I’d recommend using this Online Unix Terminal to practice.
That said, I highly recommend trying it on a real Unix/Linux platform.
If you’re using the simulator, you can start typing commands directly at the prompt ($). The first thing I want you to do is create a .txt file containing strings of different patterns so that we can extract them separately. Go ahead and type in the following and hit Enter.
cat > file1.txt
After hitting Enter, you will not see a $ as you did before. This means the command has not finished yet: it is waiting for your input. On the new lines, type the contents of your file. This is what my terminal looks like after I type the text for file1.txt.
cat > file1.txt
hello world
THIS this is great
100 to 1000
bye world
After you’re done inserting your contents into the file, press CTRL + D to say that you’re done. You’ll notice that the prompt $ is back.
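If you’d rather not type the file interactively, a here-document can do the same job in one go. This is only a sketch of an alternative, not the method used in the rest of this post; the EOF marker is an arbitrary label I picked, and any word that doesn’t appear in the text would work.

```shell
# Create file1.txt in one step using a here-document.
# Everything between the two EOF markers becomes the file's contents.
cat > file1.txt <<'EOF'
hello world
THIS this is great
100 to 1000
bye world
EOF

# Print the file to confirm what was written.
cat file1.txt
```

The result is the same four-line file, so every command that follows works unchanged.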
To search for “hello” in file1.txt, type the following command in the terminal:
grep "hello" file1.txt
This should give you the output:
hello world
This is because grep finds every line in which “hello” appears and prints that entire line. To get the line number as well, we’ll use one of the options I mentioned earlier: -n.
grep -n "hello" file1.txt
This should produce:
1:hello world
The “1” here is the line number: the match is on line 1 of the file.
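At the start I mentioned that grep can also search a group of files or an input stream. Here is a minimal sketch of both, assuming a second file named file2.txt that I made up purely for this illustration (it is not part of the walkthrough).

```shell
# Recreate the four-line file1.txt, plus a hypothetical file2.txt.
printf '%s\n' "hello world" "THIS this is great" "100 to 1000" "bye world" > file1.txt
printf '%s\n' "say hello" "nothing here" > file2.txt

# Several files: each match is prefixed with the name of the file it came from.
grep -n "hello" file1.txt file2.txt
# file1.txt:1:hello world
# file2.txt:1:say hello

# Input stream: with no file name, grep reads whatever is piped into it.
cat file1.txt | grep -n "hello"
# 1:hello world
```

Notice that the file name only shows up when grep is given more than one file to search.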
Let’s look at another example to clear up any doubts, this time searching for the word “this”.
grep -n "this" file1.txt
2:THIS this is great
Now, I hope you have a question here: which “this” do you think grep matched? Since both “THIS” and “this” are on the same line, it is hard to tell. So let’s append two more lines to the file and look at the result again.
cat >> file1.txt
this
THIS
CTRL + D
If we run the command again:
grep -n "this" file1.txt
2:THIS this is great
5:this
We notice that the 6th line, which contains only “THIS”, does not get printed. This is because grep is case sensitive by default, i.e., A is not equal to a; they are treated as different characters. But what if we want to search for the word irrespective of case? Here comes another option.
To search for a word without regard to UPPERCASE or lowercase, we use the option -i as shown below:
grep -in "this" file1.txt
Output is as follows:
2:THIS this is great
5:this
6:THIS
Here, every “this” is matched regardless of case.
Note: the options -n (line number) and -i (case insensitivity) can be combined into -in.
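To convince yourself that combining options changes nothing, here is a small sketch that runs the search three ways and compares the results. The variable names are just ones I chose for this illustration.

```shell
# Recreate the six-line file1.txt used at this point in the post.
printf '%s\n' "hello world" "THIS this is great" "100 to 1000" "bye world" "this" "THIS" > file1.txt

combined=$(grep -in "this" file1.txt)    # options merged into one
separate=$(grep -i -n "this" file1.txt)  # options given separately
reordered=$(grep -ni "this" file1.txt)   # same options, other order

# All three spellings should produce identical output.
if [ "$combined" = "$separate" ] && [ "$combined" = "$reordered" ]; then
    echo "all three forms print the same lines"
fi

echo "$combined"
# 2:THIS this is great
# 5:this
# 6:THIS
```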
Let’s try another feature. What if you want to fetch all lines except the ones that contain a particular string/pattern? This can be achieved using -v in the following way:
grep -v "this" file1.txt
(Note that this search is case sensitive, so the line containing only “THIS” is still printed.)
It will give:
hello world
100 to 1000
bye world
THIS
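Since -v is case sensitive too, it can be paired with the options we have already seen. The following is a sketch under the assumption that file1.txt holds the six lines built up so far.

```shell
# Recreate the six-line file1.txt used at this point in the post.
printf '%s\n' "hello world" "THIS this is great" "100 to 1000" "bye world" "this" "THIS" > file1.txt

# -v with -i: drop every line that contains "this" in any case.
grep -vi "this" file1.txt
# hello world
# 100 to 1000
# bye world

# -v with -n: show the line numbers that survive the case-sensitive filter.
grep -vn "this" file1.txt
# 1:hello world
# 3:100 to 1000
# 4:bye world
# 6:THIS
```

With -i added, the line containing only “THIS” disappears from the output as well.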
The examples above searched for an exact string. What if we want to search for a pattern instead? We do that using regular expressions. Regular expressions are a topic that needs a post of its own, and honestly, I really don’t want to confuse you with regex right now. So we will look at one simple example in the hope that I can get my point across.
For example, suppose you want to find every line in a file that contains a number. This is how we proceed:
grep "[0-9]" file1.txt
Its output is:
100 to 1000
What needs explaining here is “[0-9]”. We already know that the string to be searched goes inside quotation marks, which accounts for the first and last characters. The difference from the previous examples is that here we are searching for a pattern rather than a literal string, so the way we specify it is different. [0-9] matches any single character from 0 to 9 (that is, any digit), so grep prints every line that contains at least one digit.
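If you want the numbers themselves rather than the whole line, one possible approach is the -o option, which prints only the part of each line that matched. Treat this as a sketch: -o is available in GNU and BSD grep but is not guaranteed on every Unix, and the pattern [0-9][0-9]* (one digit followed by any number of further digits) is my own choice for this illustration.

```shell
# Recreate the six-line file1.txt used at this point in the post.
printf '%s\n' "hello world" "THIS this is great" "100 to 1000" "bye world" "this" "THIS" > file1.txt

# Whole lines that contain at least one digit, as in the example above.
grep "[0-9]" file1.txt
# 100 to 1000

# Only the matched text: each run of digits is printed on its own line.
grep -o "[0-9][0-9]*" file1.txt
# 100
# 1000
```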
If I went on to explain regular expressions, this post would probably end up twice as long. The example above was just a glimpse of how powerful regex can be. Tell me in the comments below if you’d like me to do a post on regex (or specifically, regex with grep).