Linux Basics : Part10 : Text Processing

In the last post we saw some basic commands and how to get help on them. In this post we will explore some text processing tools that help you view and process the contents of a file or the output of a command.

Viewing.

To simply view the contents of a file, you can use commands like "cat" and "less".

1. "cat": The "cat" command prints the entire file at once, so for anything longer than one screen you have to scroll back up in the terminal to read it from the start.

#cat file1

If the file is large this becomes difficult, and that is when the "less" command helps.

2. "less": The "less" command lets you view the contents of a file one page at a time. You can scroll line by line with the arrow keys or page by page with the space bar, and press "q" to quit.

#less file1

You can also pipe the output of a command into "less" to read it page by page or line by line.

#dmesg | less

Viewing Excerpts.

To view only an excerpt of a file instead of the whole thing, you can use commands like "head" and "tail".

1. "head": This displays the first 10 lines of a file. 10 lines is the default; if you want more or fewer lines, use the "-n" flag followed by the number of lines.

#head file1
#head -n5 file1

Like "less", "head" can also be used with other commands through a pipe to view their output in a controlled manner.
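
As a small sketch of using "head" on piped output, the example below feeds twenty numbered lines into it. The "seq" command and the variable name are only there for illustration; any command that prints several lines would do.

```shell
# Generate the numbers 1 to 20, one per line, and keep only the first three.
first_three=$(seq 1 20 | head -n 3)
echo "$first_three"
```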

2. "tail": Similar to the "head" command, the "tail" command displays the last 10 lines of a file.
If you want more or fewer lines, use the "-n" flag followed by the number of lines.
If you want to keep watching a file as new lines are appended to it, use the "-f" flag (press Ctrl+C to stop).

#tail file1
#tail -n 5 file1
#tail -f file1

The “-f” flag is especially useful when viewing and monitoring logs.
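
Here is a minimal sketch of "tail -n" on piped input, plus one way "head" and "tail" can be chained to pull lines out of the middle. The numbered input from "seq" and the variable names are assumptions made for the example.

```shell
# Keep only the last three of twenty numbered lines.
last_three=$(seq 1 20 | tail -n 3)
echo "$last_three"

# Take the first seven lines, then the last three of those: lines 5 to 7.
middle=$(seq 1 20 | head -n 7 | tail -n 3)
echo "$middle"
```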

Formatting Output.

To format the output of a command you can use commands like "sort", "cut" and "wc".

1. "sort": This command sorts the lines of a file or of piped input and writes the sorted result to standard output; the original file remains unchanged.

"sort" can be used with various flags:

-r : To perform a reverse sort.
-n : To perform a numeric sort.
-f : To ignore case when comparing lines.
-u : To remove duplicates in output.
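
To see how the flags change the result, here is a small sketch that sorts the same made-up numbers in four ways. The temporary file and its contents are hypothetical sample data.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf '10\n9\n2\n10\n' > "$sample"

plain=$(sort "$sample")        # text sort: compares character by character
numeric=$(sort -n "$sample")   # numeric sort: compares by value
reverse=$(sort -nr "$sample")  # numeric sort, largest first
unique=$(sort -nu "$sample")   # numeric sort with duplicates removed

echo "$plain"
echo "$numeric"
rm -f "$sample"
```

Note that the plain sort places 10 before 2 because it compares the lines as text, which is why "-n" matters for numbers.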

2. "cut": Displays specific columns (fields) of a file or output. The "-d" flag sets the delimiter that separates the fields and "-f" selects the field number. To use ":" as the delimiter and print the first field:

#cut -d: -f1 /etc/passwd
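
The sketch below runs "cut" on a single made-up line laid out like an /etc/passwd entry, so the result does not depend on the users on your system. The line and the variable names are assumptions for the example.

```shell
# A made-up line in the same colon-separated layout as /etc/passwd.
line='user1:x:1000:1000:User One:/home/user1:/bin/bash'

# First field only.
name=$(printf '%s\n' "$line" | cut -d: -f1)
# Fields 1 and 7; cut joins them with the same delimiter.
name_and_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)

echo "$name"
echo "$name_and_shell"
```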

3. "wc": To count the number of lines, words or characters in a file or in an output, use the "wc" command. The "-l" flag counts lines and "-w" counts words.

#wc -l /etc/passwd
#wc -w file1
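
As a quick sketch, the example below counts lines, words and bytes in a small temporary file. The file contents are made up, and the counts are read from standard input so that "wc" prints only the number and not the file name.

```shell
# Hypothetical two-line sample file.
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

lines=$(wc -l < "$sample")   # number of lines
words=$(wc -w < "$sample")   # number of words
bytes=$(wc -c < "$sample")   # number of bytes (-m counts characters instead)

echo "lines: $lines, words: $words, bytes: $bytes"
rm -f "$sample"
```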

Search Filter.

To filter a file or an output, you can use the "grep" command. "grep" is a search filter that takes a variety of parameters. It prints each line of a file or output that matches the pattern.
To search for a word in a file:

#cat file1 | grep searchword
#grep user1 /etc/passwd

Both forms work; the second passes the file name directly to "grep" and does not need "cat".
To ignore case, use the parameter "-i".
To invert the search and print only the lines that do not match the pattern, use the parameter "-v".
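
Here is a small sketch comparing a plain search, a case-insensitive search and an inverted search on the same input. The sample lines and variable names are made up for illustration.

```shell
# Hypothetical log-like sample file.
sample=$(mktemp)
printf 'User1 logged in\nuser2 logged out\nroot logged in\n' > "$sample"

exact=$(grep 'user' "$sample")        # case-sensitive: skips "User1"
any_case=$(grep -i 'user' "$sample")  # matches both "User1" and "user2"
excluded=$(grep -v 'root' "$sample")  # every line that does not contain "root"

echo "$exact"
rm -f "$sample"
```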

Advanced Formatting.

For more advanced formatting and printing you can use the "awk" command.
The uses of "awk" are vast, and entire books have been written about it. Here I have covered only the most common case, which is selecting and printing fields.

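If you want to print the usernames from the /etc/passwd file, you can use the command below. The fields in /etc/passwd are separated by ":", so "-F:" tells "awk" to split on it; without it "awk" splits on whitespace and "$1" would not be the username.

#cat /etc/passwd | awk -F: '{print $1}'

To print two fields from the same line separated by a space, you can use:

#cat /etc/passwd | awk -F: '{print $1 " " $2}'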
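
The sketch below tries the same idea on two made-up records laid out like /etc/passwd, using "-F:" so that "awk" splits each line on colons. It also shows a condition placed before the action, which is one of the other common uses. The records and variable names are assumptions for the example.

```shell
# Made-up records in the /etc/passwd layout.
sample=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$sample"
printf 'user1:x:1000:1000:User One:/home/user1:/bin/sh\n' >> "$sample"

# $1 is the user name and $7 the login shell.
names=$(awk -F: '{print $1}' "$sample")
name_shell=$(awk -F: '{print $1 " " $7}' "$sample")

# A condition before the braces limits the action to matching lines:
# here, only users whose third field (the UID) is 1000 or higher.
regular=$(awk -F: '$3 >= 1000 {print $1}' "$sample")

echo "$name_shell"
rm -f "$sample"
```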

There are many other uses of awk. Feel free to explore it, as its mastery will help you a lot in scripting.

Editing using Stream Editor.

To edit a file without opening it in an editor you can use the "sed" command. "sed" is a stream editor.
You can use sed to search for a specific word in a file and replace it with something else.

#sed -i 's/searchfor/replacewith/g' file1

where "s" stands for substitute.
The "-i" option writes the changes back to the file; without it the result is only printed to standard output.
"g" means global replace. Without "g", only the first occurrence on each line is replaced.
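
To see the effect of "g" safely, the sketch below runs "sed" on a piped string without "-i", so nothing on disk is changed. The sample text and variable names are made up.

```shell
line='cat cat cat'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/cat/dog/')
# With g: every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed 's/cat/dog/g')

echo "$first_only"   # dog cat cat
echo "$every"        # dog dog dog
```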

To search and replace only on lines that contain a keyword:

#sed -i '/keyword/s/searchword/replaceword/' file1

The above will search for "searchword" and replace it with "replaceword" only on lines that contain "keyword".
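
A minimal sketch of the keyword form, again without "-i" so the result only goes to standard output. The two sample lines and the "server" keyword are hypothetical.

```shell
# Hypothetical config-like sample file.
sample=$(mktemp)
printf 'server port=80\nclient port=80\n' > "$sample"

# Only lines matching /server/ have the substitution applied.
result=$(sed '/server/s/port=80/port=8080/' "$sample")

echo "$result"
rm -f "$sample"
```

The "client" line contains the same search text but is left alone because it does not match the keyword.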

If you would like to back up the file before making changes, replace "-i" with "-i.bak".
This saves the original contents as a backup next to the file being edited, with ".bak" appended to its name (here "file1.bak").

#sed -i.bak 's/searchword/replaceword/' file1
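
The sketch below tries the backup form inside a throwaway directory, so you can see both the edited file and the backup without touching real data. The directory, file contents and variable names are assumptions for the example.

```shell
# Work in a temporary directory so no real file is modified.
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/file1"

# Edit in place and keep the original as file1.bak in the same directory.
sed -i.bak 's/world/there/' "$workdir/file1"

edited=$(cat "$workdir/file1")
backup=$(cat "$workdir/file1.bak")

echo "edited: $edited"
echo "backup: $backup"
rm -rf "$workdir"
```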

That's it for this post. Make sure to practice these. See you in the next post on blog.avoidingtech.com.
