ktgnair.github.io


Assignment 1 / My Experience

Overall, this assignment was both challenging and rewarding.

While working on this program, I experienced first-hand the famous English idiom ‘Don’t judge a book by its cover’. When I first read the question, I thought I just needed to reverse the contents of a file, but once I looked at the program constraints I had to start searching for different approaches.

The constraints are as follows:

  • Test with very large input files, e.g. 5 GB.
  • The program should be robust, in that it should be independent of the file size. The memory requirements should be more or less the same for input files of varying sizes: 1 KB, 5 KB or 5 GB.
  • Also give memory and speed readings for 1 GB and 5 GB.

I learned about several things along the way, such as internal workings, library functions, Unix commands, naming conventions and debugging methods, which I will go through in the coming lines.

The first step was to create a very large text file for testing (the target was 5 GB).

This is the program I used to generate such a file:

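#include <stdio.h>

int main()
{
    FILE *fp = NULL;
    long int i;

    fp = fopen("1gb.txt", "w");
    if (fp == NULL)
    {
        perror("fopen");
        return 1;
    }

    /* Write the same 49-byte string 10,000,000 times (about 490 MB);
       increase the loop count to get a larger file. */
    for (i = 0; i <= 9999999; i++)
        fputs("1 My name is ktgn 2 This is 2 This is a 1 gb file", fp);

    fclose(fp);
    return 0;
}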

There are other approaches as well. One of them is the Unix command ‘cat’, which concatenates files.
So if you have a 1 GB file and need a 5 GB file, just type this:

cat 1gb.txt 1gb.txt 1gb.txt 1gb.txt 1gb.txt > 5gb.txt

The next step is finding the size of the input file.

Use the ‘stat’ function, as shown below:

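#include <stdio.h>
#include <sys/stat.h>

int main()
{
    struct stat st;
    long long n;

    /* stat() returns a non-zero value on failure, e.g. if the file does not exist */
    if (stat("1gb.txt", &st) != 0)
    {
        perror("stat");
        return 1;
    }

    /* st_size is an off_t; a 32-bit int cannot hold sizes of 2 GB or more */
    n = (long long)st.st_size;
    printf("%lld\n", n);
    return 0;
}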

Another way is to use the Unix command shown in the block below:

ls -l filename

My first approach to the code used fseek().
fseek() moves the file pointer position to a given location.

Syntax: int fseek(FILE *pointer, long int offset, int position)

where position can be one of three options:
SEEK_END : It denotes the end of the file.
SEEK_SET : It denotes the start of the file.
SEEK_CUR : It denotes the file pointer’s current position.
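
To make the role of fseek() concrete, here is a minimal sketch of reversing a file with a fixed-size buffer, so that memory use stays the same whatever the input size. It is not necessarily the code I ended up with: the helper name reverse_copy and the 4096-byte chunk size are assumptions chosen for illustration. It also assumes that long is wide enough to hold the file size; on platforms where long is 32 bits, fseek() and ftell() cannot address files of 2 GB or more.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the bytes of `in` to `out` in reverse order.
   Only one fixed-size chunk is held in memory at a time.
   Returns 0 on success, -1 on error. */
int reverse_copy(FILE *in, FILE *out)
{
    enum { CHUNK = 4096 };   /* assumed chunk size, not from the assignment */
    char buf[CHUNK];
    long remaining;

    assert(in != NULL && out != NULL);

    /* Jump to the end of the file to find out how many bytes it holds. */
    if (fseek(in, 0L, SEEK_END) != 0)
        return -1;
    remaining = ftell(in);
    if (remaining < 0)
        return -1;

    /* Walk backwards through the file, one chunk at a time. */
    while (remaining > 0) {
        long n = remaining < CHUNK ? remaining : CHUNK;
        long i;

        remaining -= n;
        if (fseek(in, remaining, SEEK_SET) != 0)
            return -1;
        if (fread(buf, 1, (size_t)n, in) != (size_t)n)
            return -1;

        /* Emit the chunk back to front. */
        for (i = n - 1; i >= 0; i--) {
            if (fputc(buf[i], out) == EOF)
                return -1;
        }
    }
    return 0;
}
```

A caller would open the input with fopen(..., "rb") and the output with fopen(..., "wb"), then pass both streams to reverse_copy(). Writing byte by byte with fputc() keeps the sketch short; reversing the buffer in place and writing it with a single fwrite() call would likely be faster.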

To create temporary files in my final approach, I used sprintf() so that I could generate sequential file names.
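
As a rough illustration of building sequential file names, the sketch below formats an index into a name. The helper name make_temp_name and the "temp_%d.txt" pattern are made up for this example, and it uses snprintf() instead of sprintf() so that a name which does not fit in the buffer is reported instead of overflowing it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "temp_<index>.txt" into `name`.
   Returns 0 on success, -1 if formatting failed or the buffer is too small. */
int make_temp_name(char *name, size_t name_size, int index)
{
    int written;

    assert(name != NULL && name_size > 0);

    /* snprintf returns the length the full string would have had. */
    written = snprintf(name, name_size, "temp_%d.txt", index);
    if (written < 0 || (size_t)written >= name_size)
        return -1;
    return 0;
}
```

Calling make_temp_name(name, sizeof name, i) inside a loop over i would give temp_0.txt, temp_1.txt and so on, each of which can then be passed to fopen().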

When I had finished coding, I realized that making the code readable is also necessary, so that anyone who looks at it in the future can understand it quickly instead of being left with doubts.

  • To make that happen, I changed the names of the variables in my code. For example, int n was changed to int file_size, because the variable's actual purpose is to store the file size.

  • Next, I removed all unwanted tabs and spaces.