2008-11-08

Google Desktop Gadget: On-Screen Ruler

The On-Screen Ruler is a simple ruler gadget that helps you align your desktop elements (other gadgets, perhaps) and measure the distance in pixels between them. It has major and minor markings just like a conventional ruler, and you can click on its solid handle to rotate it (to use it horizontally or vertically).



Options for this gadget include:

Ruler Thickness - Thickness of the ruler in pixels. (Acceptable values are between 50 pixels and 150 pixels.)

Ruler Length - Length of the ruler in pixels. (Acceptable values are between 200 pixels and the screen width.)

Division Size - Number of pixels between each division marking. (Acceptable values are between 10 and 100.)

Major Marking Interval - Number of divisions between each major division marking. (Acceptable values are between 5 and 20.)

Colour - Colour of the ruler and its markings. Select from:
    Black
    White
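How the Division Size and Major Marking Interval options interact can be sketched roughly as follows. This is not taken from the gadget's source: the function names, the choice to treat the marking at position 0 as a major marking, and the clamping of out-of-range values are all assumptions made for illustration.

```javascript
// Hypothetical helper, not from the gadget's source code.
function clamp(value, min, max) {
    return Math.min(Math.max(value, min), max);
}

// Returns the markings along a ruler of the given length (in pixels).
// Assumption: the marking at position 0 counts as a major marking.
function rulerMarkings(length, divisionSize, majorInterval) {
    // Ranges follow the option descriptions above. The upper bound on
    // length (the screen width) is left to the caller in this sketch.
    length = Math.max(length, 200);
    divisionSize = clamp(divisionSize, 10, 100);
    majorInterval = clamp(majorInterval, 5, 20);

    var markings = [];
    for (var index = 0; index * divisionSize <= length; index++) {
        markings.push({
            position: index * divisionSize,
            major: index % majorInterval === 0
        });
    }
    return markings;
}

var markings = rulerMarkings(200, 10, 5);
```

With a length of 200, a division size of 10 and a major marking interval of 5, this gives a marking every 10 pixels, with every fifth one flagged as major.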

This gadget is hosted on Google Code and licensed under the GNU General Public License v3.

2008-10-04

Google Desktop Gadget: Flickr Photo Frame

i have finally got my very first Google Desktop Gadget submission accepted and listed on the Google Desktop Gadgets listing page.

The Flickr Photo Frame is a simple digital photo frame gadget that randomly displays a Flickr user's photographs.



The main reason for writing this gadget in the first place is that i wanted to randomly display my Flickr photos on my desktop. There are already a few similar gadgets available, but i thought i would implement a new one with options that i thought would be useful. Another strong objective, of course, was to use this opportunity to learn how to write a Google Desktop Gadget.

The digital photo frame is a simple one, with the following features:

- Change of displayed photograph after a configurable interval (with optional fade transition).
- Display of the photograph title when hovering the mouse cursor over the photograph.
- Opening of the original Flickr page for that photograph in default browser when clicking on it.

The options available for the gadget are:

Flickr User Name - Name of Flickr user whose photos are to be displayed.

Size - Size of the photograph to be displayed. Select from:
    Thumbnail - Maximum of 100 pixels on each side
    Small - Maximum of 240 pixels on each side
    Medium - Maximum of 500 pixels on each side

Refresh Interval - Number of seconds between each displayed photograph change. (Acceptable values are between 10 seconds and 3600 seconds.)

Fade Transition - Whether to use fade-in transition for displayed photograph change.

Photo Alignment - Which side of the gadget the displayed photo aligns to. Useful if the gadget is to be placed along an edge of the desktop, since a photograph is usually not perfectly square.

All comments/suggestions/criticisms on how to improve this gadget are welcome! Furthermore, this project is hosted on Google Code and licensed under the GNU General Public License v3, so feel free to have a look at the source code, build better gadgets on top of it, or contribute back to this project.

Do try out the Flickr Photo Frame and let me know what you think.

2008-06-17

JavaScript: parseInt - Remember the Radix

When using the JavaScript parseInt global function, it is a good practice to always specify the radix (or number base), which is the optional second argument for that function. The radix that you would probably use most frequently, 10, is actually the default value in most cases, if that second argument is not specified. However, in a few special cases, it does not work that way, as i had found out after spending some debugging time.

i had written a JavaScript function that takes a date string (e.g. "20080617" representing 17th June, 2008) and splits it into separate integer variables representing the day, month and year, by making use of the String.substr function (to split the single string into "2008", "06" and "17") and the parseInt function (to parse each string component into an integer value). So, in this example ("20080617"), we would get the value 2008 for the year variable, 6 for the month variable, and 17 for the day variable.

However, things did not go according to plan when i tried to parse "20080509" (9th May, 2008) using the function that i had just written. Passing that string through the function gave the values 2008, 5, and 0 for the year, month and day variables respectively. So the value assigned to the day variable was obviously wrong.

The reason for such a result would have been clear (or in fact, i would not have made this mistake) if i had read the "specifications" for parseInt more carefully. From the JavaScript Kit's JavaScript reference, the parseInt function

Parses any string and returns the first valid number (integer) it encounters;


AND

Supports an optional second "radix" parameter to specify the base of the number to be parsed. Without this parameter present, parseInt assumes any number that begins with "0x" to be radix 16, "0" to be radix 8, and any other number to be radix 10.


So, in our second (and negative) example above, what happened was:

The string "2008" was parsed using the default base-10, returning the integer value 2008.

The string "05" was parsed using base-8 (because of the "0" prefix), returning the base-10 integer value 5 (because 5 in base-8 is also 5 in base-10).

The string "09" was parsed using base-8 (because of the "0" prefix). When the parser encountered the character "9", it stopped the parsing at that point (because "9" is not a valid character in the base-8 context). Hence, it returned the integer value of the "0" character, which is 0.
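A minimal sketch of the fix, passing 10 as the radix on every call. The function name parseDateString and the shape of the returned object are made up for this example and are not the original code.

```javascript
// Hypothetical reconstruction of the date-splitting function.
// The explicit radix of 10 stops a leading "0" from being read as octal.
function parseDateString(dateString) {
    return {
        year: parseInt(dateString.substr(0, 4), 10),
        month: parseInt(dateString.substr(4, 2), 10),
        day: parseInt(dateString.substr(6, 2), 10)
    };
}

var date = parseDateString("20080509");
// date.year is 2008, date.month is 5, date.day is 9
```

Later JavaScript engines (those following ECMAScript 5) no longer treat a leading "0" as octal in parseInt, but passing the radix explicitly keeps the behaviour the same everywhere.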

2008-05-03

audio-convert: Mistaking Wave Files For WMA Files

audio-convert (http://savannah.nongnu.org/projects/audio-convert) is a handy little bash script that simplifies the conversion between several audio file types, making use of various well-known codec libraries.

The only issue i faced when using the script was that it wrongly identified my wave files as WMA files. That is because one of the checks for whether a file is of the WMA format involves looking for the word "Microsoft" in the file brief (file -b filename.wav). Somehow, my wave file has a file brief of "RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz". That, coupled with the fact that i do not have mplayer installed (needed to decode WMA files), caused the script to show a warning message and exit.

A couple of simple workarounds for this:

1. If, like me, you do not work with WMA files, you can just comment out the part of the script that checks whether mplayer is installed (lines 1270 to 1281).

2. A better way would be to modify the WMA file detection in line 455 to just check if the filename ends in ".wma" (case-insensitive). Then, when you do have to work with an input WMA file, just ensure that it has a WMA file extension.

This issue, together with both workarounds, has been filed as bug #23141 (http://savannah.nongnu.org/bugs/index.php?23141).

2008-04-17

JavaScript Reminder - Always Declare Your Variables

Having developed in Java for some time, i find that the practice of declaring variables (e.g. writing var n = 0; instead of just n = 0; for the first use of the variable n) comes quite naturally. Naturally, that is, except when initialising a for-loop. Somehow, after getting into trouble with the browser a few times for writing code like for (int i = 0; i < n; i++) out of habit, i gradually fell into writing just for (i = 0; i < n; i++) instead, omitting the int (or var) altogether. After spending quite some time looking at a rather nasty bug earlier, i would just like to remind everyone of the importance of var.

This is a fragment of what i wrote:

function outerFunction()
{
    for (i = 0; someCondition; i++)
    {
        // Do Something
    }
}

function MyObject()
{
    var object = new Object();
    
    object.innerFunction = function()
    {
        // Do Something
        
        for (i = 0; i < n; i++)
        {
            outerFunction();
            
            // Do Something
        }
    };
    
    return object;
}


Upon calling MyObject().innerFunction();, my browser froze, and after a while, came back with a suggestion to terminate the script. Peppering the code with plenty of alerts led to the discovery that the value of i in the inner function's loop would "reset" back to a smaller value after the call to the outer function, and hence never reached the value of n.

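More time spent staring at the code later, i realised that the problem, and hence the solution, had been staring me in the face. Without var, the i in both loops refers to the same global variable, so each loop needs its own declaration: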

function outerFunction()
{
    for (var i = 0; someCondition; i++)
    {
        // Do Something
    }
}

function MyObject()
{
    var object = new Object();
    
    object.innerFunction = function()
    {
        // Do Something
        
        for (var i = 0; i < n; i++)
        {
            outerFunction();
            
            // Do Something
        }
    };
    
    return object;
}


Remember your vars!
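The effect can be reproduced in a small, bounded sketch. The function names and loop limits are made up for this example, and the shared variable is declared explicitly at the top as a stand-in for the implicit global (so that the sketch also runs in strict mode, where assigning to an undeclared variable is an error). A safety cap stops the loop that would otherwise never end.

```javascript
// Stand-in for the implicit global that a bare "i = 0" would create.
var i;

function leakyHelper() {
    for (i = 0; i < 3; i++) {
        // Do Something
    }
    // The shared i is now 3.
}

function leakyCaller() {
    var passes = 0;
    for (i = 0; i < 5; i++) {
        leakyHelper();      // resets the shared i to 3 on every pass
        passes++;
        if (passes >= 20) {
            break;          // safety cap: i never reaches 5 on its own
        }
    }
    return passes;
}

function safeHelper() {
    for (var i = 0; i < 3; i++) {
        // Do Something
    }
}

function safeCaller() {
    var passes = 0;
    for (var i = 0; i < 5; i++) {
        safeHelper();       // has its own i, so this loop is unaffected
        passes++;
    }
    return passes;
}

var leakyPasses = leakyCaller();   // 20, stopped only by the safety cap
var safePasses = safeCaller();     // 5, as intended
```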

2008-04-10

JavaScript - Absolute Position of an Element

i was in the midst of some HTML coding when i needed to be able to find out the absolute position of an element using JavaScript and HTML DOM. After looking through some online and offline references, and trying out some possibilities, i found that there is no element method or property that would give me the values that i needed. (The element.style.top and element.style.left attributes will only return what has been set beforehand, either through script or CSS, and even so, will only be useful for this purpose if the element.style.position attribute has been set to absolute.)

After some searching on the net, i came across a great piece of code (from QuirksMode.org) that does exactly what i needed:

function findPos(obj)
{
    var curleft = 0, curtop = 0;
    
    if (obj.offsetParent)
    {
        do
        {
            curleft += obj.offsetLeft;
            curtop += obj.offsetTop;
        }
        while (obj = obj.offsetParent);
    }
    return [curleft, curtop];
}


The site where the code snippet was taken from - QuirksMode.org - provides detailed explanations on why it works that way. To use the function above to, for example, find the absolute position of an element called div, you would just have to call

var pos = findPos(div);

The returned variable, pos, would then be an array whose first element (pos[0]) is the number of pixels between the div element and the left edge of the page, and whose second element (pos[1]) is the number of pixels from the top of the page.
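To see how the loop adds up the offsets, here is a sketch that runs the same function outside the browser. The plain objects below are made-up stand-ins for real elements and carry only the three properties that findPos reads; the offset values are arbitrary.

```javascript
function findPos(obj) {
    var curleft = 0, curtop = 0;

    if (obj.offsetParent) {
        do {
            curleft += obj.offsetLeft;
            curtop += obj.offsetTop;
        }
        while (obj = obj.offsetParent);
    }
    return [curleft, curtop];
}

// Hypothetical stand-ins for a page, a container and the div inside it.
var page = { offsetLeft: 0, offsetTop: 0, offsetParent: null };
var container = { offsetLeft: 10, offsetTop: 20, offsetParent: page };
var div = { offsetLeft: 5, offsetTop: 7, offsetParent: container };

var pos = findPos(div);
// pos[0] is 5 + 10 + 0 = 15, pos[1] is 7 + 20 + 0 = 27
```

Each pass through the loop adds the offset of the current element relative to its offsetParent, then moves up one level until there is no offsetParent left.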

2008-04-02

Tuning Java Garbage Collection

The Java virtual machine automatically handles garbage collection of objects that are no longer referenced, and you would normally not have to change or tweak the default garbage collection settings. That is, unless you are dealing with a long-running application (e.g. a web application), or if performance is of great importance. For those cases, i have found a few basic steps which provide a good starting point in tweaking those settings.

In order to tune the garbage collector, you would first have to know how it is currently behaving and performing. This can be achieved by enabling detailed garbage collection logging, and then monitoring the log file. You can get a useful garbage collection log output by appending the options

-verbose:gc -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails

to the java command when starting up your Java application. So your application start up command would end up looking something like:

java -classpath some/path:some_jarfile.jar -verbose:gc -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails your.main.Application

After your application has run for some time, you would see lines like these in the garbage collection log file (gc.log in the example):

3779.257: [GC 3779.257: [DefNew: 308160K->9280K(308160K), 0.3048983 secs] 700726K->425258K(2087872K), 0.3050826 secs]

and

3788.365: [Full GC 3788.366: [Tenured: 415978K->237898K(1779712K), 3.5419520 secs] 689157K->237898K(2087872K), [Perm : 81663K->81663K(81664K)], 3.5421538 secs]

A very basic explanation of those two lines is as follows:

Line 1: A minor collection took place at time T + 3779.257 seconds (3779.257 seconds after the application was started up). Memory usage in the young space was brought down from 308,160 KB to 9,280 KB while on the whole, 275,468 KB was recovered from the Java heap (from 700,726 KB to 425,258 KB). The collection took 0.3050826 seconds.

Line 2: A full collection took place at time T + 3788.365 seconds. 178,080 KB was recovered from the tenured space, 451,259 KB from the Java heap overall. The collection took 3.5421538 seconds.
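The arithmetic behind those figures can be checked with a small script. This is only a sketch for this particular log format: the function name parseGcTransitions and the returned field names are made up, and it simply picks out every "before->after(capacity)" triple in a line, in the order in which they appear.

```javascript
// Extracts every "<before>K-><after>K(<capacity>K)" triple from one log line.
function parseGcTransitions(logLine) {
    var pattern = /(\d+)K->(\d+)K\((\d+)K\)/g;
    var transitions = [];
    var match;

    while ((match = pattern.exec(logLine)) !== null) {
        var beforeKB = parseInt(match[1], 10);
        var afterKB = parseInt(match[2], 10);

        transitions.push({
            beforeKB: beforeKB,
            afterKB: afterKB,
            capacityKB: parseInt(match[3], 10),
            reclaimedKB: beforeKB - afterKB
        });
    }
    return transitions;
}

var minor = parseGcTransitions("3779.257: [GC 3779.257: [DefNew: 308160K->9280K(308160K), 0.3048983 secs] 700726K->425258K(2087872K), 0.3050826 secs]");
var full = parseGcTransitions("3788.365: [Full GC 3788.366: [Tenured: 415978K->237898K(1779712K), 3.5419520 secs] 689157K->237898K(2087872K), [Perm : 81663K->81663K(81664K)], 3.5421538 secs]");

// minor[1].reclaimedKB is 700726 - 425258 = 275468 (Java heap overall)
// full[0].reclaimedKB is 415978 - 237898 = 178080 (tenured space)
// full[1].reclaimedKB is 689157 - 237898 = 451259 (Java heap overall)
```

In both example lines the second triple is the one for the Java heap as a whole; the first is for the generation being collected.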

Now, with that logging in place, it is possible to spot a few basic performance-related issues with memory and garbage collection.

Memory Leaks

If the figure representing the total used memory after collection (e.g. 425,258 KB in the first case) continuously increases for a relatively long amount of time, a memory leak situation may be occurring. Profiling the application for a period of time and monitoring objects which increase endlessly in count and size could help tremendously in hunting down the source of the leak. From my personal experience, memory leaks are most commonly caused by continuously adding objects to a static or long living collection (e.g. a cache) and neglecting to remove them.

Barely Sufficient Heap Size

If the figure representing the total used memory after collection (e.g. 425,258 KB in the first case) is close to the total available heap size (e.g. 2,087,872 KB in the first case) before a significant amount of time has even passed, the set heap size may be insufficient and may need to be increased. If heap usage continues to increase with time, thrashing may eventually take place (i.e. the application spends almost all its time doing only garbage collection), with the occasional java.lang.OutOfMemoryError causing havoc. Even in the case where memory usage is already stable, application performance may still benefit from increasing the heap size, as garbage collection would then be performed less frequently.

Young Generation Guarantee

Typically, most of the collections occurring should be minor collections (first line in the example above). If full collections (second line) happen for the majority of the time, memory usage due to long living objects may be too high and the young generation guarantee can never be met. One possible solution for this would be to set the size of the new space to a smaller number (thus allowing more space in the tenured generation). This is likely to happen in applications which rely heavily on caching (for performance, somewhat ironically), and the symptoms can be easily spotted from the garbage collection logs. A simple tweak in the new size setting usually results in significant performance improvements.

Garbage Collection Strategy

Depending on factors such as the physical configuration of your server (e.g. number of processors), the type of application (e.g. online transaction or batch processing), and the memory usage pattern (e.g. heavy caching), it may even be worthwhile to explore alternative garbage collection strategies. In one of my previous projects, an online transaction application running on a server with multiple processors, we experienced a significant performance boost by switching from the default strategy to the concurrent low pause collector with parallel minor collections.

Reference: http://java.sun.com/docs/hotspot/gc1.4.2/

2008-03-12

Java Library to Search in PDF Files

i had been looking around for an open source Java library that would facilitate searching in PDF files, when i discovered the solution of using a combination of PDFBox and Apache Lucene.

PDFBox is an open source Java PDF library for working with PDF documents. It allows creation of new PDF documents, manipulation of existing documents, and - most importantly for this purpose - the ability to extract content from documents.

Apache Lucene, on the other hand, provides Java-based indexing and search technology.

It is not hard to see, then, that these two libraries can be used in combination for PDF searching in Java; PDFBox can be used to extract text from PDF documents, and Lucene can be used to search through the extracted text. In actual fact, it is easier than that, as PDFBox provides a utility that enables simple integration with Lucene. This utility is the org.pdfbox.searchengine.lucene.LucenePDFDocument class, which contains static methods for obtaining a Lucene document from a PDF file. The document can then be added to a Lucene index, which can be searched with an index searcher.

A simple implementation that determines whether a specified term is present in a PDF file:

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.pdfbox.searchengine.lucene.LucenePDFDocument;

public final class SimplePdfSearch
{
    private static final String PDF_FILE_PATH = "/path/to/pdffile.pdf";
    private static final String SEARCH_TERM = "searchterm";
    
    public static final void main(String[] args) throws IOException
    {
        Directory directory = null;
        
        try
        {
            File pdfFile = new File(PDF_FILE_PATH);
            Document document = LucenePDFDocument.getDocument(pdfFile);
            
            directory = new RAMDirectory();
            
            IndexWriter indexWriter = null;
            
            try
            {
                Analyzer analyzer = new StandardAnalyzer();
                indexWriter = new IndexWriter(directory, analyzer, true);
                
                indexWriter.addDocument(document);
            }
            finally
            {
                if (indexWriter != null)
                {
                    try
                    {
                        indexWriter.close();
                    }
                    catch (IOException ignore)
                    {
                        // Ignore
                    }
                    
                    indexWriter = null;
                }
            }
            
            IndexSearcher indexSearcher = null;
            
            try
            {
                indexSearcher = new IndexSearcher(directory);
                
                Term term = new Term("contents", SEARCH_TERM);
                Query query = new TermQuery(term);
                
                Hits hits = indexSearcher.search(query);
                
                System.out.println((hits.length() != 0) ? "Found" : "Not Found");
            }
            finally
            {
                if (indexSearcher != null)
                {
                    try
                    {
                        indexSearcher.close();
                    }
                    catch (IOException ignore)
                    {
                        // Ignore
                    }
                    
                    indexSearcher = null;
                }
            }
        }
        finally
        {
            if (directory != null)
            {
                try
                {
                    directory.close();
                }
                catch (IOException ignore)
                {
                    // Ignore
                }
                
                directory = null;
            }
        }
    }
}


This code fragment demonstrates only the basic concept, and is not very useful per se, but it is not difficult to extend it to do some powerful searches by utilising the capabilities of Lucene. For example, different queries such as a phrase query or a fuzzy query can be used instead of a term query (see org.apache.lucene.search.Query), and a highlighter object can be used to extract the text fragments that contain the found term (see org.apache.lucene.search.highlight.Highlighter).

2008-03-07

Previous Version

Archived posts from the previous version of this blog can be found at http://hello-world-1-0.blogspot.com/.