Log   |   Assignments   |   Source   |   Discussion   |   Feedback   |   Summary  |   About Me  |

This page tracks the course work progress.

2011-04-16

It's been a while since I updated the discussions page. I've been wanting to write a lot, but I'm sometimes overwhelmed by the amount backlog I have! I've tried to write something about filesystems now.

2011-04-12

I've started looking at the code for 68ASMSIM, but I don't think I will have anything to show the Professor tomorrow.

2011-04-08

I stumbled upon an article on finding the physical location of a user within ~700 m using just the IP address. What's more interesting was one of the user comments:

"Back in the early 80's a Physic's [sic] grad student at Berkley was working in their data center and noticed a descrepency [sic] in user usage statistics and started investigating. He was able to isolate the user ID of the unauthorized user by analysing the usage statistics. At the time the user statistics were used for billing computer time. The user was basically trying to use the Berkley system as a proxy for attacks on other systems. He eventually spliced into the network to intercept packets containing the User ID in question and calculated the amount of time it took for those packages to complete a round trip to determine the geo location of the person hacking into the system. At first he thought he was wrong because his calculations based on signal response time said the unauthorized user was 6000 miles away. He later discovered the calculation was correct and the hacker was located in Germany. He published a book called "The Cuckoos Egg" with all the details. It is a really good book. "

I'd totally forgotten about this book after the professor mentioned it in class. Today, he referenced it again, and now I see this comment - I should get a copy asap and read it!

The article I was reading is interesting too - Read

2011-04-06

Had a close encounter with Murphy's Law today, but it all ended well, fortunately. I had loaded FreeDOS on my 32GB USB drive and made sure everything worked on my laptop. I came to the lab today, confident that I had everything figured out, plugged my drive into one of the machines and tried to boot from it. The boot loader came up and stopped with an error saying that it couldn't find command.com. I first thought there was a corruption in the drive, but when I looked at the error messages, I realized that the BIOS in the DCF machines somehow doesn't support DOS drives larger than 8GB!!! There were error messages saying LBA ( Logical Block Addressing) support was not enabled. I still find it strange, because those systems are not that old (On a partially un-related note, it's a bit disappointing to have to work with systems powered by Celeron processors!)

I usually look at programs that run on one system and not on another with suspicion and I was sad that I'd have to tell the Professor exactly that (again!). I had no choice but to tinker with Qemu and try to get it run my code. I'm happy that I had a "Eureka" moment at that time and it suddenly dawned to me that Qemu could be allocating too little memory to DOS, resulting in conflicts between the segments in my program and the video memory. This would also explain why the program goes crazy only if the graphics mode is enabled. I allocated more memory using the '-m 192M' option and everything started working!

This is still entirely a hypothesis and I don't have any proof, yet. Documentation says video memory starts at B8000 (Text) and C0000 (Graphics), and I have not consciously used any location in that range. So I have to do a bit more reading to understand what exactly happened. Anyway, I was able to get Bresenham's Algorithm running on the DCF machine before the Professor came to see the output :)

I think we're (at least I am) passing through some "Murphy's Law-Prone Zone" - My wife ran into problems with her phone during an important conference call yesterday evening - Some un-expected guests turned up during the call and the phone wouldn't let her mute the call!

References:
PCGuide Ref - LBA
PCGuide Ref - Hard Disk Size Barriers - Very useful site on various size barriers

2011-04-05

My program still has the problem in Qemu, but it runs well if I boot my laptop into DOS and run it without using any Virtualisation tools. I hope to show that in tomorrow's lab session and hope the DCF machine doesn't let me down!

2011-03-30

Ran into new problems at the lab. My program works perfectly with VMWare, but goes into an infinite loop (And draws crazy stuff) when using qemu. If the program is debugged in text mode, everything works perfectly. I can see the CX and DX registers getting proper values and all loops / calls are being executed properly. I still don't have a clue why this is happening.

I've uploaded the course summary - Read.

2011-03-29

Identified the problem with the input routine. All my segment registers were pointing to the same address, resulting in a total mess. I am surprised that the subroutine call succeeded even once! I've changed the binary format to .exe and initialized all segment registers properly. Now the program works fine. nasm did not allow me to define the segments properly while using the .com format. I'm still not sure why that happened. I have to ask the professor about that.

Some basic validation is in place - the user is not allowed to enter co-ordinates outside the screen dimensions. While reading the input from the user, each character is not ensured to be numeric. If an alphabet/symbol is entered, the program proceeds with (ASCII Code & 0x0F) as the input value.

I will be showing this in the lab session tomorrow. The professor had told me that it's possible to smoothen steep lines using anti-aliasing techniques and that there are techniques using only integer operations. I haven't yet found how to do it and am yet to do an exhaustive search for the algorithm. I did look at Xiaolin Wu's algorithm, but that requires a divide operation. I have to think whether it's possible to implement it using fixed point operations.

; Bresenham's Algorithm
; Based on the algorithm listed under section "Simplification"
; in Wikipedia
; Reads (x0,y0) and (x1, y1) from the user
;
; Author: Kurian John
; Date  : 2011-03-29
;
; Code
;
segment code
..start:
	mov ax, data
	mov ds, ax
	mov ax, stack
	mov ss, ax
	mov sp, stacktop
redo:
	; Initialise stuff
	mov [ds:err], word 0
	mov [ds:isx], word 1
	mov [ds:isy], word 1

	; Display welcome message
	mov dx, wel
	mov ah, 9
	int 21h
	mov dx, int
	mov ah, 9
	int 21h
	mov dx, exm
	mov ah, 9
	int 21h

readx0:	
	mov dx, mx0
	mov ah, 9
	int 21h	
	call readinp
	cmp dx, 320
	jl x0ok
	mov dx, xer
	mov ah, 9
	int 21h
	jmp readx0	
x0ok:
	mov [ds:x0], dx	

ready0:
	mov dx, my0
	mov ah, 9
	int 21h	
	call readinp
	cmp dx, 200
	jl y0ok
	mov dx,yer
	mov ah, 9
	int 21h
	jmp ready0
y0ok:
	mov [ds:y0], dx

readx1:
	mov dx, mx1
	mov ah, 9
	int 21h	
	call readinp
	cmp dx, 320
	jl x1ok
	mov dx, xer
	mov ah, 9
	int 21h
	jmp readx1
x1ok:
	mov [ds:x1], dx

ready1:	
	mov dx, my1
	mov ah, 9
	int 21h	
	call readinp
	cmp dx, 200
	jl y1ok
	mov dx, yer
	mov ah, 9
	int 21h
	jmp ready1
y1ok:
	mov [ds:y1], dx

	cmp [ds:x0], word 0
	jnz goahead
	cmp [ds:y0], word 0
	jnz goahead
	cmp [ds:x1], word 0
	jnz goahead
	cmp [ds:y1], word 0
	jnz goahead
	jmp sayonara
	; If all co-ords are zero, we are done
	;ret
goahead:

	; Init display, 320x200
	mov al, 13h
	mov ah, 00
	int 10h
	
	mov ax, [ds:x1]  
	mov bx, [ds:x0]	 
	sub ax, bx		; ax = x1 - x0
	jns sxpos	
	mov [ds:isx], word -1	; Set sx=-1 if x0>=x1
	neg ax			; dx was negative - Get abs
sxpos:
	mov [ds:idx], ax	; Save value of dx
	mov [ds:err], ax	; Value of dx in err	
	mov ax, [ds:y1]
	mov bx, [ds:y0]
	sub ax, bx		; ax = y1 - y0
	jns sypos
	mov [ds:isy], word -1	; Set sy=-1 if y0>=y1
	neg ax
sypos:
	neg ax			; Now [idy]=-dy for
	mov [ds:idy], ax	; ease of computations
	add [ds:err], ax 	; [err] = dx-dy
loop:
	mov cx, [ds:x0] 	; Get y co-ord in dx
	mov dx, [ds:y0] 	; Get x co-ord in cx
	mov ah, 0ch
	mov al, cl		; Color the point
	and  al, 77		; 
	int 10h			; Paint the point
	cmp [ds:x1], cx		; If we've reached
	jne stillon		; (x1, y1), then stop.
	cmp [ds:y1], dx		; Else continue
	jne stillon		;
	jmp done
stillon:
	mov ax, [ds:err]   	; err in ax
	mov bx, ax 
	shl bx, 1		; e2 = err*2 in bx	
	cmp ax, [ds:idy]
	jle cmp2
	mov ax, [ds:idy]
	add [ds:err], ax	; err-=dy
	mov ax, [ds:isx]
	add [ds:x0], ax		; x0+=sx
cmp2:
	cmp bx, [ds:idx]
	jge loopend
	mov ax, [ds:idx]
	add [ds:err], ax	; err+=dx
	mov ax, [ds:isy]
	add [ds:y0], ax		; y0+=sy
loopend:
	jmp loop
done:
	mov ah, 00		; Wait for a key
	int 16h			;	 	
	mov ah, 00		; Choose video mode setting
	mov al, 03		; Choose text mode 80x25
	int 10h			; Set the mode
	;ret			; Return
	jmp redo
sayonara:
	mov ah, 9
	mov dx, tky		; Say thank you
	int 21h	
	mov ax, 4c00h		; And exit
	int 21h

;
; Subroutine to read input from the user
; This reads input character by character
; and after each char, multiplies the existing
; value by 10 and adds current char
; This is done for a maximum of three characters
; or till a return key is pressed
; There is no validation here - we expect numbers
; only, but if someone enters a char, we
; take the ASCII code and proceed
; Read value will be stored in dx on return
;
readinp:
	mov dx, 0
	; Read character to al
	mov bl, 2	; Process max 3 numbers
loopr:
	mov ah, 01h
	mov al, 00h
	int 21h
	cmp al, 0Dh	; Stop if we got an enter
	je done1
	
	; Process input
	mov ah, 00
	and al, 0fh	; Get number alone in al
	xchg ah, dh	; Now result is in ax and
	xchg al, dl	; read value in dx
	mov cl, 10
	mul cl		; Multiply existing result by 10
	add ax, dx	; Add the digit that we read now
	xchg ah, dh	; Now result in dx
	xchg al, dl	; 
	
	cmp bl, 00h
	je done1
	sub bl, 1
	jmp loopr
done1:
	; Put a CR+LF 
	push dx		; Save dx coz int 21 affects dl
	mov ah, 2
	mov dl, 0dh
	int 21h
	mov ah, 2
	mov dl, 0ah
	int 21h
	pop dx 
	ret

segment data
mx0:	db 	'Enter x0: ','$'
my0:	db	'Enter y0: ','$'
mx1:	db	'Enter x1: ','$'
my1:	db	'Enter y1: ','$'
wel:	db	'Welcome to the Line Plotter!',13,10,'$'
int:	db	'Enter the start (x0,y0) and end (x1,y1) points',13,10,'$'
exm:	db	'Provide all 0s/returns to exit...',13,10,'$'
tky:	db	'Thank you for using the Line Plotter!',13,10,'$'
xer:	db	'X co-ordinate should be between 0 and 319!',13,10,'$'
yer:	db	'Y co-ordinate should be between 0 and 199!',13,10,'$'
x0:	dw 0
y0:	dw 0
x1:	dw 0
y1:	dw 0
isx:	dw 0
isy:	dw 0
idx:	dw 0
idy:	dw 0
err:	dw 0
res:	dw 0

segment  stack stack
	resw 50
stacktop:
2011-03-25

I'd forgotten to upload the updated code for Bresenham's Algorithm. This still uses hard-coded values for start / end points as I'm still having problems with calling the input routine. After the second/third invocation, the routine returns to some (seemingly) random address. The professor asked me to see if there are any additional initialisations required for calling INT 21h. In first look, it doesn't seem to have other init requirements, but I haven't looked at what happens inside the ISR. Have to do that. The code is listed below:
; Bresenham's Algorithm
; Based on the algorithm listed under section "Simplification"
; in Wikipedia
; Currently assumes that (x0,y0) is in (ds:x0, ds:y0) and
; (x1, y1) is in (ds:x1,ds:y1)
;
; Author: Kurian John
; Date  : 2011-03-22
;
; Variables for intermediate values
; Reserve one word (16 bits) each
;
SECTION .bss
x0:	resw 1
y0:	resw 1
x1:	resw 1
y1:	resw 1
isx:	resw 1
isy:	resw 1
idx:	resw 1
idy:	resw 1
err:	resw 1
res:	resw 1
;
; Code
;
SECTION .text
START:
	; Initialise stuff
	mov [ds:err], word 0
	mov [ds:isx], word 1
	mov [ds:isy], word 1
	
	; Start and end co-ords - Hard coded for now
	;mov [ds:x0], word 319
	;mov [ds:y0], word 0
	;mov [ds:x1], word 100
	;mov [ds:y1], word 100

	; Init display
	mov al, 13h
	mov ah, 00
	int 10h
	
	mov ax, [ds:x1]  
	mov bx, [ds:x0]	 
	sub ax, bx		; ax = x1 - x0
	jns sxpos	
	mov [ds:isx], word -1	; Set sx=-1 if x0>=x1
	neg ax			; dx was negative - Get abs
sxpos:
	mov [ds:idx], ax	; Save value of dx
	mov [ds:err], ax	; Value of dx in err	
	mov ax, [ds:y1]
	mov bx, [ds:y0]
	sub ax, bx		; ax = y1 - y0
	jns sypos
	mov [ds:isy], word -1	; Set sy=-1 if y0>=y1
	neg ax
sypos:
	neg ax			; Now [idy]=-dy for
	mov [ds:idy], ax	; ease of computations
	add [ds:err], ax 	; [err] = dx-dy
loop:
	mov cx, [ds:x0] 	; Get y co-ord in dx
	mov dx, [ds:y0] 	; Get x co-ord in cx
	mov ah, 0ch
	mov al, cl		; Color the point
	and  al, 77		; 
	int 10h			; Paint the point
	cmp [ds:x1], cx		; If we've reached
	jne stillon		; (x1, y1), then stop.
	cmp [ds:y1], dx		; Else continue
	jne stillon		;
	jmp done
stillon:
	mov ax, [ds:err]   	; err in ax
	mov bx, ax 
	shl bx, 1		; e2 = err*2 in bx	
	cmp ax, [ds:idy]
	jle cmp2
	mov ax, [ds:idy]
	add [ds:err], ax	; err-=dy
	mov ax, [ds:isx]
	add [ds:x0], ax		; x0+=sx
cmp2:
	cmp bx, [ds:idx]
	jge loopend
	mov ax, [ds:idx]
	add [ds:err], ax	; err+=dx
	mov ax, [ds:isy]
	add [ds:y0], ax		; y0+=sy
loopend:
	jmp loop
done:
	mov ah, 00		; Wait for a key
	int 16h			;	 	
	mov ah, 00		; Choose video mode setting
	mov al, 03		; Choose text mode 80x25
	int 10h			; Set the mode
	ret			; Return
	

2011-03-22

I've been trying to implement the Bresenham's algorithm using 8086 assembly and have been consistently running into problems. It's really hard to work with just 8 general purpose registers. The processors that I've worked with earlier (i960, ADSP 21xx) all have more than enough number of registers and I'm finding it really hard to come up with a working program! I thought I had everything right, but when I run the code, the system just reboots! :) Have to figure out what's wrong.

Edit1: 18:30
Got a very limited version running. It has the following limitations:
  • I've used 8-bit registers for storing positions and all other position-related variables. I did this thinking that the program will run faster, but I had forgotten the critical fact that the screen size is 320x200 and 8-bit signed representation is only from -128 to + 127! So, my program goes haywire when it sees input co-ordinates beyond (127,127).
  • The start and end co-ordinates are hard-coded. I've not implemented any I/O routine.
  • Horizontal and vertical lines can't be drawn, which is probably due to a bug, which I hope to fix before tomorrow morning.
  • Steep lines look *really* bad. They look more like an ant's trail than a line. I don't know if it's a problem with the algorithm of my program.
The code is listed below. It's more complex than it has to be because I wanted to minimise memory operations, but that has resulted in a buggy program with serious limitations! I think Knuth/Hoare was right when he said "Premature optimisation is at the root of all evil!"
; Bresenham's Algorithm
; Based on the algorithm listed under section "Simplification"
; in Wikipedia
; Currently assumes that (x0,y0) is in (cl,ch) and
; (x1, y1) is in (al,ah)
;
; Author: Kurian John
; Date  : 2011-03-21
; 
;
; x0    -       cl
; y0    -       ch
; x1    -       al
; y1    -       ah
; sx    -       bl
; sy    -       bh
; dx    -       dl
; dy    -       dh
; err   -       1200h

START:
        mov al, 13h
        mov ah, 00
        int 10h
        mov cl, 15
        mov ch, 15
        mov al, 25
        mov ah, 10
        mov bl, 1       ; Init sx, sy
        mov bh, 1       ; to 1
        mov [1200], word 0      ; Init err to 0
        mov dl, al      ;  
        sub dl, cl      ; dl = al - cl (x1 - x0)
        cmp cl, al      ; if x0 < x1 jump to sxpos
        jl sxpos        ;
        mov bl, -1      ; Set sx=-1 if x0>=x1
        neg dl          ; dl was negative - Get abs
sxpos:
        mov dh, ah
        sub dh, ch
        cmp ch, ah
        jl sypos
        mov bh, -1
        neg dh
sypos:
        neg dh          ; Now dh=-dy for ease of computations
loop:
        push ax
        push cx
        push dx
        mov dl, ch      ; Get y co-ord in dx
        mov dh, 0       ;
        mov ch, 0       ; Get x co-ord alone in cx
        mov ah, 0ch
        mov al, 1
        int 10h
        pop dx
        pop cx
        pop ax
        cmp cl, al
;       jne stillon ; Bug - to be fixed
; Following je causes problems for horiz/vert lines
	je done
        cmp ch, ah
; Following je causes problems for horiz/vert lines
	je done
;        jne stillon ; Bug - to be fixed
;        jmp done    ;
stillon:
        push ax
        mov al, [1200]   ; err
        mov ah, al
        shl ah, 1       ; e2 = err*2

        cmp ah, dh
        jle cmp2
        add [1200], dh
        add cl, bl
cmp2:
        cmp ah, dl
        jg loopend
        add [1200], dl
        add ch, bh
loopend:
        pop ax
        jmp loop
done:
        mov ah, 00      ; Wait for a key
        int 16h         ;       
        mov ah, 00      ; Choose video mode setting
        mov al, 03      ; Choose text mode 80x25
        int 10h         ; Set the mode
        ret             ; Return
I'm planning to make some enhancements to this one and will be showing this in tomorrow's lab session.

2011-03-21

Added modification times of relevant pages to the home page. I've used the iframe tag to do this, because I was not able to embed the required php script in to cal.html. It's supposed to work if a proper .htaccess file is provided in the directory, but I was not able to get it running. So, I've created a php file that just prints the modification times and embedded that php file in the iframe.

2011-03-20

Curious Case of the Missing RAM
It all started when my laptop ran out of memory while trying to open one of the plots we generated for a VLSI Design Automation assignment. So I ordered an extra 2GB of RAM and got it when I went home last week. The BIOS showed 4096MB and I left the shop happily. I noticed the problem only after reaching home. After booting the system, the OS would detect only 3GB of RAM! Both Fedora 14 and Windows 7 had this problem. After some googling, I realized that a normal 32-bit OS can see only 3GB of RAM! I found this strange, because 2^32 is 4GB and the system shouldn't have any issues addressing 4GB!

After further research, I found that this is due to the memory mapped IO (MMIO) addresses conflicting with RAM addresses. In most systems, the PCI bus supports 32-bit addresses and this is in the same address space as the main memory! So, when the BIOS sees a conflict between the MMIO and RAM addresses, it responds by disabling a large chunk of RAM! That is, even if the OS and processor supports greater than 4GB memory, the PCI address space is only 32-bit wide and will inevitably result in conflict and disabling of RAM chunks. BIOSs can work around this issue by having "holes" in the RAM address space. That is, RAM addresses that conflict with MMIO addresses are remapped to another (higher) address, leaving "holes" in the address space. This also requires a feature in the operating system called "Physical Address Extension" (PAE).

Though the virtual address space is still 32-bit wide in modern (32-bit) x86 processors, they have 36 physical address lines, allowing memory up to 64GB. It is, however, interesting to note that since the virtual address is still 32-bit wide, a single application can access only 4GB of memory at a time! My brother had once told me about this limitation, but it's now that I understand the reason.

The operating system does the translation between the 32-bit virtual address and the 36-bit physical address using page tables. When PAE is enabled, each entry in the page table increase to 64 bits. Also, additional bits are included in the page directory to indicate the page size. I just found that the NX (No eXecute) bit also resides here. The NX bit is used by operating systems to prevent arbitrary code execution, so that malicious programs cannot exploit security holes and execute code stored in the data area.

After reading all this, I went ahead and installed a kernel compiled with PAE support and now I can use see the entire 4 GB of RAM :)

I have seen fairly high end notebook computers with my friends here, with 3GB RAM. I always used to wonder why the manufacturer would ship something with 3GB of RAM, when they can ship 4GB with a marginal increase in price! Now I understand the reason - that would require them to ship 64-bit (probably costlier and less compatible) versions of proprietary OSs that are usually bundled with new systems!

In any case, Linux has saved the day and I can use what I paid for. (I still can see only 3GB in the proprietary OS!). Thank you, Linus and RMS :)

References:
3GB Barrier (Wikipedia)
Physical Address Extension (Wikipedia)
No eXecute Bit (Wikipedia)

I found the recent lectures about I/O devices and device drivers very interesting. I've been using FOSS for more than ten years now and I'm sad that I still remain a user and not a contributor. I do reply occasionally to forum posts and used to take part in the LUG activities back home, but I want to do more. I've noticed that (lack of) stable device drivers is a big setback for Linux systems and I wanted to learn how to write/debug device drivers and contribute back to the community. This remains one of the things I want to do the most while in campus. I hope I will be able to accomplish something before I leave! I'm sorry that I couldn't find time to write about these things in the discussions. I will make some entries in the coming week.

I've not started writing the summary yet. I've thought about the general structure, and now have to start writing.

2011-03-15

I have made some changes to the Quadratic Equation Solver UI. The text boxes are now editable completely (Insert / Delete anywhere) - View

I have also written a program to plot the mouse position with generous help from here - View source

I will be showing these two programs in tomorrow's lab session.

2011-03-08

I will be showing the line program in tomorrow's lab session. I have found two machines which have Qemu in DCF and have tested my program there.

I have not made much progress with the other programs. I will also show the results of the HTML Entity Search program for large input files. I used the gettimeofday function from <sys/time.h> (which returns time with micro-second precision) and got the following results:

File Size # Entities Time (ms)
4 MB 40194 180.32
12 MB 120582 530.96
258 MB 2451834 10534.82

The tests were run on my laptop which has a 2.27GHz processor.

2011-03-07

Downloaded NASM (Netwide Assembler), a free assembler for FreeDOS. Wrote a small assembly program and ran it.

A multi-coloured line


I used the following code to generate the line. This doesn't use any algorithm. All the program does is paint pixels from (10,10) to (130,130) with different colours. This program is inspired by Kamal Varma Kambe's program written for the CS410 course (Fall 2008).
START:
	mov bx, 120 	; The counter
	mov al, 13h	; Graphical 320x200 px / 256 colors
	mov ah, 00	; Choose video mode setting
	int 10h		; Set the mode
	mov cx, 10	; Position co-ords 
	mov dx, 10	; Position co-ords
LOOP:
	int 10h		
	mov ah, 0Ch	; Choose color setting
	mov al, bl	; Get the color value from counter
	int 10h		; Set the current pixel
	add cx, 01	; Increment pixel co-ord
	add dx, 01	; Increment pixel co-ord
	sub bx, 01	; Decrement counter
	cmp bx, 00	; Reached count?
	JNE LOOP	; Loop if not
EXIT:
	mov ah, 00	; Wait for a key
	int 16h		; 	"
	mov ah, 00	; Choose video mode setting
	mov al, 03	; Choose text mode 80x25
	int 10h		; Set the mode
	ret		; Return

I have written assembly code before, but never for an x86 and certainly never something which draws something on the screen. It was really nice to see the text mode change when I tried to single step the program using DEBUG. It was like going back to the old BASIC days !

The professor had told us last time that we shouldn't be using VMWare and that the outputs should be shown only on the DCF machines. I don't know if it's possible to work on the DCF machines using the liveCD, because there is (probably) no DOS compatible filesystem on any of those machines. One of the machines I checked had Qemu on it and so I've created a Qemu-compatible DOS filesystem on a file. It works on my laptop, but I will test it on one of the DCF machines today.

Reference:
8086 BIOS and DOS Interrupts (www.emu8086.com)


It's been some time since I touched my other programming assignments. Last week, I realized that my programs are all bloated and almost beyond repair. I am usually happy with the code that I write, but the Quadratic Equation Solver is really out of hand, probably because of the umpteen last-minute (Tuesday night) modifications. I can hardly bring myself to open them and start working. I should do something about this.

2011-03-06

I met the professor on Friday to talk about the missed deadlines. He told me that it happens because of lack of group work and not being used to regular, hard work and asked me to keep working. I will do that.

We got our first homework assignment with a hard deadline on Friday (2011-03-04) - An essay on what we've learned from the course. This essay has to be ready by the last day of March. I've just started thinking about the points to include it it.

Last two classes were about interrupts and communication between the CPU and the external world - Read Discussions.

2011-03-02

I have missed the deadline for assignments yet again. I need to think before setting a new dead line for myself since this is the third deadline I'm missing. I think the whole point of a self-imposed deadline was that we should set realistic goals and achieve them.

I have looked up some coding guidelines and updated the discussions page - Read

2011-03-01

I am developing an ncurses UI for the HTML Entity search program and hope to show that in tomorrow's lab session.
I have not made much progress with the Quadratic Equation Solver from last lab class. The things to be done are:
  • The UI allows deleting characters, but not inserting in between
  • Leading/Trailing spaces and space after the sign are not accepted gracefully.
  • Better precision reporting - Now I just say correct up to 5 digits when b*b >> 4ac, which I think, is not true always.
I will try to fix these issues before tomorrow's session, but I am not sure whether I will be able to.

I'm updating this log through a friend since I don't have access to the server from where I stay and so I will not be able to upload the code (Which will be ready only later tonight).

2011-02-24

I've played around with the UBCD/FreeDOS/DEBUG and got some interesting results. Confirmed that there is a jump instruction at FFFF:0000. More information in the discussions page - Read.

2011-02-23

The fifth version of the Quadratic Equation Solver (v0.5) with a simple ncurses UI is uploaded. This is still buggy and needs a lot of work. The source is available here

2011-02-22

After hours of tinkering with ncurses, I have a simple program that works (almost) satisfactorily. I've written the program for checking if a character is a vowel with a simple ncurses UI. The source can be viewed in the Source section.

I stumbled upon a small ncurses tutorial on building a simple text editor and it helped me a lot. You can read the tutorial here.

I will be showing this program in the lab session tomorrow. I am also working on a UI for the Quadratic Equation Solver and Entity Search programs, but I don't think I will have presentable results by tomorrow morning.
And I'm failing to meet another self-imposed deadline. I was supposed to complete both the assignments tomorrow, and I am most certain that I will not be able to meet the deadline. I request one more week to complete everything. I sincerely hope I can complete developing the UI for both programs by March 2nd.

UBCD
I have downloaded the UBCD image and made a bootable USB drive using it. Strangely, the DEBUG program wasn't included in the image. I have downloaded it separately, but haven't tried using it yet. Will post results after trying it out.

2011-02-21

We had a long discussion with the professor today on whether to continue formal classes (as before) or to have just lectures and no assignment/webpage evaluation. It's clear that I am finding it hard to cope up with the demands of the course in its present form. I haven't updated my webpage in the last six days and I don't have any excuses for that. Also, I haven't made any significant progress with the first two assignments yet and haven't yet started with the vowel detection program.

The professor told us this will only become worse as the amount of effort this course (as well as other courses) demands is only going to increase. He also suggested that the second method (No evaluations) would be much easier for us.

I know that it is not fair on my part to expect him to put a lot of effort in tracking my assignments and webpages, only to find that I've not done anything for almost a week. But I would still prefer the hard way, since I think this course has taught me to be systematic (at least a bit). Though I have my own share of faults, I have worked sincerely for this course and will continue to do so.

An unfortunate event in class (a cellphone that felt the need to make a noise at the worst possible moment!) triggered this discussion. I used to bring a phone to class too, but I think I won't from now on.

If Friday's (2011-02-18) class, the professor had given us the following two options on how to continue the course:
  1. Talk about IBM-PC as an excellent hardware platform
  2. Actual issues in Computer Architecture
I was one of the people who initially preferred option 1, since most of the things that were discussed in recent classes were new to me. I didn't want to skip ahead to advanced topics without having a solid base to build on. I changed my mind today, after more discussions with the rest of the class and I think I can learn more if we go with option 2. The things I don't know, I will either clarify with my classmates or look up from reference material.

2011-02-15


The updated HTML Entity Search program (v0.3) is uploaded.
Enhancements in this version:
  • Fixed missing entity bugs. It was a very interesting problem. My test HTML file contained a unicode character (ΓΏ), which C somehow identified as the EOF character and stopped processing my file! In the process of trying to locate the issue, I even rewrote the program without using regular expressions!
  • Valid and invalid entities written to separate files. Reason for rejection is also included along with the invalid entities.
I will be showing this in tomorrow's lab session. The source is available here.


I have not made much progress with the Quadratic Equation Solver. Based on the last review, my program now accepts leading spaces silently, but I just thought about trailing spaces and it asks the user for confirmation if trailing spaces are found. I must fix that also. I'm learning ncurses and have tried creating window frames, moving them around etc but I haven't yet figured out a good way to create and editable text box. I will keep working on it.

2011-02-14

Discussions updated for 2011-02-11 and 2011-02-14.

I had planned to work on the assignments during the weekend, but couldn't, since I was down with fever. So I will not be able to meet the promised dead-line of 2011-02-16. I will try to finish them by 2011-02-23.

2011-02-10

I had made some slight changes to the website, but had not logged it.
  • Left aligned the content.
  • Changed the width specification from percentage to fixed (800px) so that the text position does not change when the user resizes the window. I had focussed on the layout and gave the percentage specification so that the page *looks* the same even if the user resizes the window. But after talking to the professor, I realized that it is important that the text position doesn't change when the window is resized.
  • Removed hyperlink linking to the current page itself.
Discussions updated for 2011-02-04 and 2011-02-07.

2011-02-09

The fourth revision of the Quadratic Equation Solver (v0.3) is uploaded.
Enhancements in this version:
  • Accepts spaces between sign and number.
  • Tells user the number of guaranteed digits of precision.
The issues listed on 2011-02-01 are still present. The source is available here.

The second version of the HTML Entity Search program (v0.1) is uploaded.
Enhancements in this version:
  • Matches entities with a carriage return after the &.
  • Added some logic to handle entities which occur at the 100-character border
The code is still buggy and I have confirmed that it misses some entities. I'm debugging it presently. Also, the program still matches any string that fits the pattern. I've thought about the hash table design but not yet started coding. The source is available here.

I checked some of the machines in DCF and none of them had the PCRE development libraries installed. So I'll need my laptop for compiling and running the HTML Entity Search program.

Edit 1: 07:30
The third version of the HTML Entity Search program (v0.2) is uploaded.
Enhancements in this version:
  • Reports valid and invalid entity matches by searching a hash table.
Some bugs still seem to be present which results in some entities being missed. The source is available here.

Edit 2: 20:00
Updated discussions with input validation requirements and some floating point 'gotcha's. Read it here.

2011-02-08

I've not been updating the logs as frequent as I would've liked... It's been almost a week since I touched any page, which I think, is too long. Looking back, I can say I was tied up with work from other courses and had to make an unplanned trip home, etc. But more than any of that, I think I need to manage my time better. I will try to keep updating frequently, at least twice a week, as we promised.

HTML Entity Finder: Though the professor told me that it would be difficult to match the entities using regular expressions, I am still giving it a try. I've modified the regular expression so that it matches carriage returns between the & and other characters.

&[\n\r]*[a-z|A-Z]+| #[0-9]+;

This seems to be working for a few test files that I used. My program still matches all strings that match the pattern and not valid entities alone. I heard from my friends that the professor suggested using a hash table. I've looked at a few examples online and will start working on it today.

Edit 1: 14:40
Just found out that the above expression matches strings of the form '#123;' instead of '&#123'. The following expression should fix that:

&([\n\r]*)([a-z|A-Z]+|#[0-9]+);

Quadratic Equation Solver: I saw Kashyap's program and it looks really good. I've heard about ncurses but never used it. I will try to use ncurses so that the user has the ability to edit any wrong input provided.
I hope to complete both these programs by the next Wednesday (2011-02-16). It's a tough target, but I will try to meet it.

2011-02-02

I was able to use PCRE in C after installing the necessary development libraries. The program is complete and can be viewed here.

I have used the sample program (pcredemo.c) provided along with the PCRE library as the base for my code. Most of the error handling is taken directly from this file.

Edit 1: 16:30
Issues with Quadratic Equation Solver The Quadratic Equation Solver was tested with case where b*b >> 4*a*c. The program generated non-zero roots and a warning that the solution may not be accurate, but no indication on the guaranteed digits of precision.
The program should report the guaranteed number of digits or precision too, so that the user can use the results appropriately. I don't have any idea on how to do this yet, but will try to find out.
Also, my program treats spaces between the sign and number as an invalid entry. I plan to modify the program so that spaces between the sign and number are accepted.

Issues with HTML Entity Finder The program currently does not match entities if a carriage return is inserted in between. Such entities are also to be detected. I compared the output of my program with Anupama's program and found a difference in the total number of entities reported for the same file. I think that my program misses some entities because of the way input is read (Lines greater than 98 characters are split up. So, if an entity spans across the split, it won't be detected). I am yet to confirm this, but am pretty sure that is the problem.

2011-02-01

A new revision of the Quadratic Equation Solver has been uploaded here.
I am now working on the program which locates HTML entities in an input file. I still haven't figured out how to use regular expression search in C and if nothing else works out, I might try writing it in C++.

2011-01-31

I still don't have a working Quadratic Equation Solver. I now feel that I should have spent more time in making the program give right results (Handling b*b > > 4*a*c) rather than spending so much time on input validation with no results!

I have started thinking about how to write a program to match all entities in an HTML document. I think using a regular expression search would be easier than writing the code to search the input file. I still have not figured out how to do a regular expression search in c. It is, however, possible in C++ with the Boost library function regex_search. Also, I think the following expression will match all entities:

&(([a-z|A-Z]+)|(#[0-9]+));

I found that there is something called 'Perl Compatible Regular Expressions (PCRE)', which can be used for regular expression matching. I couldn't, however, find any direct examples for PCRE in c. Guess I'll have to do it the hard way - figure out from the manpage!

This is the second time I'm writing a regular expression (For a practical application. I have seen and written regular expressions for the MCCS course last semester) and so am probably wrong. Have to make sure that this expression matches all entities and only the entities.

2011-01-30

I had always considered myself proficient in writing and thought I paid a lot of attention to detail. After reading "The Elements of Style", I realised that not only was I ignorant about certain rules, but some rules I did follow were wrong too!
  • Usage of comma - In a series of three or more terms with a single 'and', a comma is required before the 'and' too, which I didn't know.
  • Due to - I did not know that 'due to' is not equivalent to 'because of'.
  • Usage of 'however' - I did not know that it is not proper to begin a sentence with 'however' unless the intended meaning is 'whatever way' or 'whatever extent'.
  • Usage of 'lesser' and 'fewer' - I remember being told the difference between these two while in school, but had forgotten. Now I know!
  • Thanks in advance - I used to end my e-mails with 'Thanks in advance', but now I understand that it can be interpreted in another way, orthogonal to the intended meaning!
  • Maintaining the same tense - This is one rule I always violate, even though I know the rule. While summarising technical reports, I tend to mix tenses. If I try to correct the issue, the statements seem to lose their charm (which is probably just a feeling). I must work on this and try to get summaries worded properly.
  • Rules for splitting words at end of line were also useful, though such things are now handled by word processing software.
There are, however, some points in the book that I still don't agree with completely:
  • Always use positive statements - The book recommends saying

    He thought the study of Latin useless.

    instead of

    He did not think that studying Latin was much use.

    While this would be fine for a literary review or in an essay, I don't think usage of such blunt statements are proper in technical reports or in other documents where delicate statements are required (Diplomatic communications?). Similarly, I feel that statements in passive voice sound better in some technical reports. For example,

    The data is cleared subject to the closure of actions 6.8 and 6.9.

    sounds better than

    We are clearing the data subject to the closure of actions 6.8 and 6.9.

  • To-day / To-night - I think rules like these do not apply now as the English language is constantly evolving.

2011-01-23

I was traveling from Friday evening till now and I'm just starting to write the log today. However, I did think about the lecture on 2011-01-21. It was certainly different from the usual classes and a lot more topics were covered in more or less a monologue. I would definitely prefer the earlier mode of teaching and I hope we (the class) will all be able to meet the expectations of the professor.

In any case, the class was interesting because I could relate it to things that I've heard about / used at work. The executable which runs on the onboard computers, after being linked, will contain a code segment, data segment and bss segment. So if a particular variable is to be made dynamically initialisable, we would need to move it from the bss segment to the data segment (So that it is allocated in RAM).

Long back, M68k processors were in use, but later replaced by more capable processors. I have not worked with the 68k or PDP-11, but just read in Wikipedia that the code is really compact for 68k, due to the large instruction width and larger number of addressing modes.

I have also worked with an Analog Devices microcontroller, ADSP-2101 and it has separate data and program memories. But I think the memories are split up for improved execution speed rather than preventing execution of data memory contents or arithmetic with program memory contents. Even with separate memories, it is possible to use data in program memory locations for ALU operations.

2011-01-20

Made some changes to the web pages:
  • Removed frames as suggested by the professor.
  • Tweaked the colors and fonts as the old style looked too dull. The original mono-space font was intended for portability but it didn't look too good.
  • Added a new "Home" page instead of going to the "About" page directly.
  • Tried CSS Style Sheets for the first time and added styles to tables and hyper links.

2011-01-18

Added some basic validation to the feedback form. There's still lots of work to do on that. I spent some time today on working on the Quadratic Equation solving program trying to write it without returning a pointer from the function. The updated code is available in the Source page. The following questions were posed in the class for us to think about:

How do you measure the time taken for a program to run?
If measurements accurate to the second are sufficient, the time() function can be used. It returns the number of seconds elapsed since January 1, 1970. We can get the timestamps at the beginning and end of main () and subtract the times to get the total time taken.
If a higher resolution is required, we can use the gettimeofday() function which returns the time with micro-second precision.
I think the gettimeofday() function is not available under Windows and so code which uses it will not be portable.

2011-01-17

All the pages in the site are active now. I've used PHPMailer along with my GMail account for the feedback form. I have not added any validation yet and anybody can send messages as of now. I plan to add some field validations and IP address logging. I'm yet to enter the logs / discussion for the first three lectures. The following questions were posed in the class for us to think about:

Design a mechanism to find out if overflow has taken place in an add operation
I could think of a general way to detect overflow - That is, if the carry in to sign bit and carry out from sign bit are different, then there is an overflow. But as discussed in class, checking this after each add operation would have a negative impact on performance. I don't have any good ideas yet for overflow detection specific to the ADD operation. It could be detected without additional code if the ALU itself has hardware for XORing the carry in/carry out from the sign bit, but I am not sure if that is how processors detect overflow. I am sure that some such circuitry exists, since I've seen that many processors have overflow detection and options to raise an exception / clamp the value to maximum etc.

How do we deal with very large numbers in practice, when the word length inside computers is limited?
It is not practicable to use native data types to store real world data because the data might be too large to fit in that data type. One way of handling very large numbers is to store the numbers as character arrays and then define custom mathematical operators on them. For example, if we require addition of very large numbers, we can store them as strings (say n1 and n2 of length l1 and l2) and use the following procedure to find the sum:

Find the smaller number, say n2
Loop from i=l2-1 till 0

Get the ith digit of n2 and (i-l2+l1)th digit of n1 and convert them to the corresponding integers
Add the integer values with carry from previous step
Find and store store sum and carry from the result

Loop from i=(l1-l2-1) to 0

Get the ith digit from n1 and convert to corresponding integer Add the carry from previous step
Find and store store sum and carry from the result

If there is a carry from the last addition, prepend the carry to the result

I got the above logic from the discussions on this forum. Here, for converting a character of the string to the corresponding integer, all we need to do is subtract the ASCII code corresponding to '0' from the code of that character.
This method can be used to find the sum of integers of arbitrary length using only simple int operations. There are many third party libraries available for similar applications and the GNU Multiple Precision Arithmetic Library is one of them. From the few pages of documentation that I read on the GMP, I understand that they also use string representation for storing the numbers of arbitrary precision.

2011-01-14

The following questions were posed in the class for us to think about:

How many flags are affected in a microprocessor when a compare operation is performed? Which are the affected flags?
Most processors have a sign/negative flag and a zero flag and these two flags will be modified during the compare operation to store the result. For instance, in the ADSP 21060 microcontroller, on executing COMP (Rx, Ry) instruction, the following flags are modified:

Flag Description Effect of Compare operation
AZ Zero Flag Set if the contents of Rx and Ry are equal
AN Negative Flag Set if the Rx < Ry, cleared otherwise

The data sheet for the microcontroller lists the flags affected after each instruction.
In x86 processors, the following flags are modified for the CMP instruction:

Flag Description Effect of Compare operation
ZF Zero Flag Set if the OP1 and OP2 are equal
SF Sign Flag Set if the OP1 < OP2, cleared otherwise

The following flags are also modified since the comparison is actually done using a subtract instruction:
Flag Description Effect of Compare operation
OF Overflow Flag Set if an overflow was generated during signed subtract operation
CF Carry Flag Set if an overflow was generated during unsigned subtract operation
AF Adjust Flag Set if a carry was generated from the lower nibble during subtraction
PF Parity Flag Set if the number of ones in the result is even
Details on the x86 instructions were taken from Wikipedia and here.

How many characters do you typically find in a line of a book?
Most of the textbooks I looked at have between 70 and 80 characters per line. A typical line in my notebook has 50 characters and a typical document with my preferred font (Tahoma, size 11) has around 60 characters per line.

What are the additional codes required during text editing?
In addition to the alphanumeric keys, we would require the following special key codes for easy text editing:
  • Arrow keycodes for navigating the document
  • Delete
  • Newline
  • Carriage return
  • Tab

2011-01-12

The following questions were posed in the class for us to think about:
Why do we need both the single quote (') and the double quote (") characters?
How many symbols do we actually need to interact with the computer? How are they represented inside the computer
The symbols are coded in some format for internal representation in the computer. How are the symbols mapped to the code? Is it arbitrary?

2011-01-05

The following questions were posed in the class for us to think about:
Why is the Stack a deadly feature?

2011-01-04

The following questions were posed in the class for us to think about:
Is there a storage mechanism other than flip-flop?
Why shouldn't we write self-changing code in modern computers?
What is the typical access time for a modern memory chip?
List a few problems requiring high CPU usage