Monday 25 August 2014

Processes in Linux: An Overview



Well, enough of the GUI stuff. I think it is time to move inside Linux, so let us take a quick look at how Linux manages processes.
By definition, a process is an instance of a program in execution. In other words, a program becomes a process while it is being executed. The concept applies to all modern operating systems, so understanding it helps us understand our own OS.



To list the processes, we can use the ps command. Type in the following to get a long listing of all the processes in the system:

ps -el

The ‘e’ option selects all processes, and the ‘l’ option selects the long listing format.
Each process has a distinct ID called the process id or ‘pid’. The id of its parent process is referred to as the ‘ppid’.
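A quick way to see these two ids for yourself: the shell is a process too, and it exposes its own pid and its parent's pid as the variables $$ and $PPID. The sketch below assumes a POSIX shell and a ps that accepts the standard -o and -p options; the variable names my_pid and my_ppid are made up for the example.

```shell
#!/bin/sh
# $$ is the pid of the running shell, $PPID the pid of its parent.
my_pid=$$
my_ppid=$PPID
echo "this shell: pid=$my_pid ppid=$my_ppid"

# -p restricts the listing to one pid; -o picks the columns to print.
ps -o pid,ppid,comm -p "$my_pid"
```

The row printed by ps should carry the same two numbers as the echo line above it.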
To kill a process, use the ‘kill’ command with the process id. By default, kill sends the SIGTERM signal, which asks the process to terminate. For example, the following command will end the process with pid 20170:

kill 20170

Don’t you think it is faster than pressing Ctrl+Alt+Del -> selecting a process -> End Process in Windows?
Or, if you are fond of the GUI side, simply type ‘xkill’ in the terminal. Your cursor will change to a skull. Now you simply need to click on the window of the program you want to close. Try it, it is rather fun!


Beware! If you click on your desktop, your desktop will be gone, and you may have to log out and log in again to bring it back. Many distributions, though, are intelligent enough to bring the desktop back on their own, along with a ‘crash menu’ reporting that it closed unexpectedly, without making you log out.
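Back on the command line, here is a safe way to practise kill without touching a real program. This is only a sketch: it starts a throwaway ‘sleep’ in the background, kills it, and then checks how it ended. The variable name victim is made up for the example.

```shell
#!/bin/sh
# Start a throwaway background process to practise on.
sleep 60 &
victim=$!            # $! holds the pid of the most recent background job
echo "started sleep with pid $victim"

# kill without a signal name sends SIGTERM (signal 15), a polite request to exit.
kill "$victim"

# wait collects the exit status; 128 + 15 = 143 means "terminated by SIGTERM".
status=0
wait "$victim" || status=$?
echo "exit status: $status"
```

The shell reports a process ended by a signal as 128 plus the signal number, which is where the 143 comes from.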

Now let us write some code to create a process. A process is created from another process by using the ‘fork()’ system call. The new process is called the child process. fork() returns an integer whose value tells us where we are:
  • fork() < 0 : Child process creation failed. No child exists, and we are still in the parent.
  • fork() == 0 : Child created; we are executing in the child process.
  • fork() > 0 : Child created; we are executing in the parent process, and the value returned is the pid of the child.

The system calls to get the pid and ppid are ‘getpid()’ and ‘getppid()’ respectively.
Here is a C program that creates a process:

#include <stdio.h>
#include <sys/types.h> /* pid_t */
#include <unistd.h>    /* fork(), getpid(), getppid() */

int main(void)
{
      printf("Current process id: %d\nParent process id: %d\n", (int)getpid(), (int)getppid());
      fflush(stdout); // flush now, or the child inherits and reprints any buffered text
      pid_t v = fork();
      if (v < 0) printf("Child process creation was not successful!\n");
      else
      {
            if (v == 0)
            {
                  // code for child process
                  printf("Child process started with pid:%d and parent id:%d\n", (int)getpid(), (int)getppid());
                  printf("Child prints 1 item\nChild will exit now\n");
            }
            else
            {
                  // code for parent process
                  printf("Parent process running with pid:%d\n", (int)getpid());
                  printf("Parent prints 1 item.\n");
            }
      }
      // both the parent and the child reach this line
      printf("Exiting process %d with parent pid: %d\n", (int)getpid(), (int)getppid());
      return 0;
}

Two special categories of processes worth mentioning are orphan processes and zombie processes.

Orphan Process:

In the long listing of processes, the second column indicates the state of the process, for example sleeping (S) or running (R). If a parent process terminates before its child does, the child becomes an ‘orphan process’. In Linux, the very first process is the ‘init’ process, which has pid 1. Linux maintains a tree structure for its processes, and an orphan, having no parent, would violate that structure. Hence any orphan process is immediately adopted by the init process, and no process in Linux stays an orphan for long.
To demonstrate this, let us have a look at this code:

#include <stdio.h>
#include <stdlib.h>    /* exit() */
#include <sys/types.h> /* pid_t */
#include <unistd.h>    /* fork(), getpid(), getppid(), sleep() */

int main(void)
{
      printf("Current process id: %d\nParent process id: %d\n", (int)getpid(), (int)getppid());
      fflush(stdout); // flush now, or the child inherits and reprints any buffered text
      pid_t v = fork();
      if (v < 0) printf("Child process creation was not successful!\n");
      else
      {
            if (v == 0)
            {
                  // code for child process
                  printf("Child process started with pid:%d and parent id:%d\n", (int)getpid(), (int)getppid());
                  printf("Child prints 1 item\nChild will sleep now\n");
                  sleep(12); // sleep the child for 12 seconds
            }
            else
            {
                  // code for parent process
                  printf("Parent process running with pid:%d\n", (int)getpid());
                  printf("Parent prints 1 item. Parent will exit now.\n");
                  exit(0); // the parent terminates while the child is still asleep
            }
      }
      return 0;
}

Say you compile it as ‘orphan.out’.
In the above code, the parent exits just after creating the child and printing a message on the screen. Meanwhile, the child process sleeps for 12 seconds. During these 12 seconds, open up a new terminal and list all the processes. Notice that the ppid of ‘orphan.out’ is now the pid of ‘init’ (pid 1), indicating that init has become the parent of ‘orphan.out’. On some newer systems you may instead see the pid of a per-user ‘systemd’ process, which plays the same adopting role.
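The same adoption can be watched without compiling anything. The following is a rough sketch, assuming a POSIX shell and a ps that accepts -o and -p; the variable names launcher, orphan and new_ppid are made up. A short-lived inner shell starts a ‘sleep’ and exits, and we then ask ps who the parent of that sleep has become.

```shell
#!/bin/sh
# The inner sh starts a background sleep, prints its own pid and the sleep's
# pid, and exits at once, leaving the sleep without its original parent.
# The sleep's output is redirected so that $(...) does not wait for it.
ids=$(sh -c 'sleep 5 >/dev/null 2>&1 & echo "$$ $!"')
launcher=${ids% *}   # pid of the short-lived parent
orphan=${ids#* }     # pid of the sleep it left behind

# The launcher has already exited here, so the sleep has been re-parented.
new_ppid=$(ps -o ppid= -p "$orphan" | tr -d ' ')
echo "orphan $orphan: old parent $launcher, new parent $new_ppid"

kill "$orphan" 2>/dev/null   # tidy up the leftover sleep
```

The new parent printed is whichever process adopts orphans on your system, so the exact number may differ from machine to machine.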

Zombie Process:


The name is funny, isn't it? Well, it is a good analogy to real world zombies. In ‘Linuxworld’, a zombie is a child process that has terminated while its parent is still alive and has not yet collected the child's exit status. The child, though no longer running, is still listed in the process table with a ‘zombie flag’, indicated by ‘Z’ in the second column. A zombie holds almost no resources: only its entry in the process table, which includes its pid and exit status. Zombies pose no threat in small numbers, but they can be very annoying for the system in large numbers. This is because Linux has a limited number of pids, and it may run short of them if too many are occupied by zombies, in which case no more processes can be started. A zombie is cleared from the process table when its parent collects it with a system call like ‘wait()’, or when the parent itself terminates, after which init adopts the zombie and collects it. To create a zombie process, you can use this code:

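#include <stdio.h>
#include <sys/types.h> /* pid_t */
#include <unistd.h>    /* fork(), getpid(), getppid(), sleep() */

int main(void)
{
     printf("Process started...\n");
     printf("process id:%d\tparent process id:%d\n", (int)getpid(), (int)getppid());
     fflush(stdout); // flush now, or the child inherits and reprints any buffered text
     pid_t v = fork();
     if (v < 0) printf("Child process creation was not successful!\n");
     else
     {
           if (v == 0)
           {
                // child: prints, then falls through to the return below and exits at once
                printf("Child process started with pid:%d and parent id:%d\n", (int)getpid(), (int)getppid());
                printf("Child prints 1 item\nchild will exit now\n");
           }
           else
           {
                // parent: sleeps without calling wait(), so the exited child stays a zombie
                printf("Parent process running with pid:%d\n", (int)getpid());
                printf("Parent prints 1 item\nSleeping parent for 20 seconds\n");
                sleep(20); // sleep the parent for 20 secs
           }
     }
     printf("Exiting process %d with parent pid: %d\n", (int)getpid(), (int)getppid());
     return 0;
}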

Compile this code as ‘zombie.out’. Run it, and within the 20 seconds open a new terminal to list the processes. Notice the ‘Z’ in the state column of the child process; ps also marks it as <defunct>.
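If you would rather let the terminal do the looking, the sketch below builds a short-lived zombie from the shell and counts it. It assumes a POSIX shell, awk, and a ps that accepts -e and -o; the variable names parent and zombies are made up for the example.

```shell
#!/bin/sh
# The inner sh starts a child that exits at once, then replaces itself
# (exec keeps the pid) with a sleep that never calls wait(), so the dead
# child stays in the process table as a zombie until that sleep ends.
sh -c 'sleep 0 & exec sleep 3' >/dev/null 2>&1 &
parent=$!

sleep 1   # let the child finish and turn into a zombie

# Count processes whose parent is $parent and whose state starts with Z.
zombies=$(ps -e -o ppid= -o stat= |
          awk -v p="$parent" '$1 == p && $2 ~ /^Z/ { n++ } END { print n + 0 }')
echo "zombie children of $parent: $zombies"

wait "$parent"   # once the parent exits, the zombie is adopted and cleared
```

After the final wait returns, listing the processes again should show that the zombie is gone.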

I hope this post helped you get a little closer to Linux. Feel free to share your questions and knowledge in the comments below... till then, happy Linuxing :D !

Sunday 24 August 2014

It's time to Zip!




Hey guys, sorry for posting so late. I was busy these days due to my semester examination.

Today we are going to discuss file compression and archiving in Linux. For Windows users, this may be a bit unconventional. In Windows, you just need to click the compress option in the menu to get a “ZIP Archive” of the selected files and folders. What I mean to say is that archiving and compression are done together in one step. Many Windows users may not even know the difference between compression and archiving. The same can be done in Linux with a similar GUI tool, but things done from the command line have their own charm, as I say ;)

Let us make the concepts clear first.

Archiving is the concatenation of two or more files into a single file, called the archive file. The total size of this archive file is approximately equal to (but not less than) the sum of the sizes of the files it contains.

Compression, on the other hand, is reducing the size of a file on the disk by using certain algorithms. If decompression gives back exactly the original file, the algorithm is lossless; otherwise it is lossy.
It is a common practice to create an archive of the selected files and then compress it to get the final “compressed archive”. Unlike in Windows, in Linux compression and archiving are two different steps. For archiving, the most common format is '.tar', which stands for tape archive.
Once we have the archive, we are ready to compress it using one of the various compression tools available. The most common tools are gzip, bzip2, zip, 7z, etc. Of these, 7z usually gives the highest compression ratio but is also the slowest. The basic trade-off is between speed and compression ratio. The most popular tools in Linux are gzip and bzip2. Here we will use gzip, which gives the compressed file the extension '.gz'. It is worth mentioning that bzip2 usually achieves a higher compression ratio than gzip, but is slower. For general usage, I would always recommend gzip.
The basic strategy is to convert the files into an archive and then compress it using gzip. Thus a file 'myFile' gets converted into 'myFile.tar' and then into 'myFile.tar.gz'.
Luckily, we can do both of these things using only one command line tool, 'tar'.
Open the terminal, and select a folder to compress. Say we select 'directory1', which contains a large number of text files.
Now go to the parent directory of 'directory1' and enter the following command:

tar -czvf myArchive.tar.gz ./directory1

Note: If we wanted to do the same with files instead of the entire directory, simply enter:

tar -czvf myArchive.tar.gz file1 file2 file3 (...and so on)

Your compressed archive will be created with the name 'myArchive.tar.gz'. The .tar.gz extension indicates that the directory was first archived into 'myArchive.tar' and then this 'myArchive.tar' was compressed into 'myArchive.tar.gz' using the gzip tool. Alternatively, we can name the file 'myArchive.tgz' to show the same. The .tgz or .tar.gz files are also called tarballs.
To extract the archive, use the following command:

tar -xvzf myArchive.tar.gz
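To convince yourself that nothing is lost on the way, here is a small round-trip sketch. It builds a scratch copy of 'directory1' with two made-up files, creates the tarball, extracts it into a separate folder, and compares the result with the original. The scratch location, the file names and the 'restored' folder are assumptions for the example; the -C option asks tar to change into the given directory first.

```shell
#!/bin/sh
# Build a small sample tree in a scratch directory (names are made up).
work=$(mktemp -d)
mkdir "$work/directory1" "$work/restored"
printf 'first file\n'  > "$work/directory1/file1.txt"
printf 'second file\n' > "$work/directory1/file2.txt"

# Create the tarball; -C makes tar change into $work before reading ./directory1.
tar -czf "$work/myArchive.tar.gz" -C "$work" ./directory1

# Extract it somewhere else and compare the copy with the original.
tar -xzf "$work/myArchive.tar.gz" -C "$work/restored"
if diff -r "$work/directory1" "$work/restored/directory1" >/dev/null; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "round trip: $roundtrip"
rm -rf "$work"
```

The ‘v’ option is left out here only to keep the output short.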

Now let us explore the tar command in a bit more detail.
The basic structure is:
tar <options> <archive name> <file1> <file2> <file3> ....
where the options are:
  • c: Create archive. Creates a “tar” archive.
  • z: Zip/unzip. Compresses or decompresses the archive using gzip (for bzip2, use “j”).
  • v: Verbose. Lists the files as they are processed. You may leave this option out, but it is good practice to see what “tar” is doing with your files.
  • f: File. Tells tar that the next argument is the name of the archive file to create or read, which is why “f” comes last in the option group, right before the archive name. Without it, tar falls back to a default destination rather than the file you had in mind, so it is good practice to always use this option.
  • x: Extract. Extracts the files from the archive.
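One more option is handy alongside the ones listed above: ‘t’, which lists the contents of an archive without extracting anything. A minimal sketch, using a made-up scratch directory and file names:

```shell
#!/bin/sh
# Prepare a tiny tarball to inspect (paths and names are made up).
work=$(mktemp -d)
mkdir "$work/directory1"
printf 'a\n' > "$work/directory1/file1.txt"
printf 'b\n' > "$work/directory1/file2.txt"
tar -czf "$work/myArchive.tar.gz" -C "$work" ./directory1

# t lists the archive instead of extracting it; f still comes last in the
# option group because the next argument is taken as the archive name.
listing=$(tar -tzf "$work/myArchive.tar.gz")
echo "$listing"
rm -rf "$work"
```

Listing first is a cheap way to check what a tarball will drop into your folder before you extract it.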


The tar tool has numerous such options. For a detailed list, you can view its man page by entering:

man tar

Further, you may first archive the files and then compress the archive manually using the gzip command line tool. For this, first create a tar archive (do not use the “z” option). Then use the gzip tool to compress the tar file. If you want to decompress a .gz file, use the 'gunzip' tool. See the man page of gzip for more details.
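The two manual steps can be sketched as follows. The scratch directory and the file 'notes.txt' are made up for the example; note that gzip replaces the file it compresses, and gunzip does the reverse.

```shell
#!/bin/sh
# Sample data in a scratch directory (names are made up).
work=$(mktemp -d)
mkdir "$work/directory1"
printf 'some text\n' > "$work/directory1/notes.txt"

# Step 1: archive only (no z), giving a plain .tar file.
tar -cf "$work/myArchive.tar" -C "$work" ./directory1

# Step 2: compress it; gzip replaces myArchive.tar with myArchive.tar.gz.
gzip "$work/myArchive.tar"
after_gzip=$(ls "$work")

# Reverse: gunzip restores myArchive.tar, which tar can then list or extract.
gunzip "$work/myArchive.tar.gz"
members=$(tar -tf "$work/myArchive.tar")
echo "$members"
rm -rf "$work"
```

The end result of steps 1 and 2 is the same kind of file that the single command with the “z” option produces.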
Compressed archives are a very good way to back up or efficiently store files which are not required at present. Being a coder, I frequently archive my programs to free up disk space. For simple text documents, the reduction in size may be as high as 90% or more. For example, my folder containing C and C++ code, with a total size of approximately 106 MB, was reduced to 4.43 MB after converting it into a tarball, a saving of about 96%.
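The arithmetic behind such a figure is simple: space saved = (1 - compressed size / original size) x 100. Below is a small helper to work it out; the function name percent_saved is made up, and the two sizes only need to be in the same unit.

```shell
#!/bin/sh
# percent_saved ORIGINAL COMPRESSED: prints the size reduction in percent.
# The shell only does integer arithmetic, so awk handles the division.
percent_saved() {
    awk -v o="$1" -v c="$2" 'BEGIN { printf "%.1f\n", (1 - c / o) * 100 }'
}

# The figures quoted above: 106 MB of source code, 4.43 MB as a tarball.
saved=$(percent_saved 106 4.43)
echo "space saved: $saved%"
```

For your own archives, the sizes can be read off with ‘ls -l’ or ‘du’ and passed to the function in the same way.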
But before this news gets you excited enough to try your hand at tar, let me give you an important tip:
Do not try to compress multimedia files (pictures, videos, music), as your work will be in vain. This is because most multimedia formats are already compressed, so you will get very poor compression. PDFs also do not compress much, but it can be worth doing if you want to free up a few MB. Once I tried to compress 14 movies with a total size of 20 GB. The entire process took 30 minutes, and as a result I got a tarball of 19 GB. So it was a big waste of my 30 minutes. With 7z it obviously took much more time, and I had to kill the process midway as I was getting bored ( :D ).
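You can see this effect in a few seconds without sacrificing any movies. The sketch below is only an approximation: it uses 1 MiB of repetitive text as a stand-in for source code and 1 MiB of random bytes as a stand-in for already-compressed media, then runs gzip on both. The file names and the helper size_of are made up.

```shell
#!/bin/sh
work=$(mktemp -d)
size_of() { wc -c < "$1" | tr -d ' '; }   # file size in bytes

# 1 MiB of repetitive text, standing in for source code or plain documents.
yes 'int main(void) { return 0; }' | head -c 1048576 > "$work/text.txt"
# 1 MiB of random bytes, standing in for already-compressed media.
head -c 1048576 /dev/urandom > "$work/random.bin"

# -c writes the compressed data to stdout and leaves the input file alone.
gzip -c "$work/text.txt"   > "$work/text.txt.gz"
gzip -c "$work/random.bin" > "$work/random.bin.gz"

text_gz=$(size_of "$work/text.txt.gz")
random_gz=$(size_of "$work/random.bin.gz")
echo "text:   1048576 -> $text_gz bytes"
echo "random: 1048576 -> $random_gz bytes"
rm -rf "$work"
```

The text shrinks to a tiny fraction of its size, while the random data stays about as large as it was, which is roughly what happened with my movies.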