Platinum Solutions Corporate Website

Sanjay Agravat's blog

Fix for date_select in Rails 2.0.2

If you are using Rails 2.0.2 you may notice that you get an Internal Server Error when you do something like this:

<%# fields_for "child[]", Interest.new do |f|%>
<%= f.text_field :name %>
<%= f.date_select :birth %>
<%# end %>

I'm doing something similar to that example and I found out there is a patch scheduled for rails 2.0.3. If you can't wait for the fix then you can add the patch to the stable code base yourself (unfortunately, the fix isn't available in "edge rails" yet).

Dual Boot Ubuntu 7.04 and Windows XP

I've been a user of Debian since the "Potato" release in 2000. Ubuntu is a Debian based distro that focuses on a platform that should "just work" (TM). Ubuntu works great with my Dell D820 Laptop and had no problems with my sound card, video driver, and power management. I was definitely a happy camper so I thought I'd share how to create a dual boot environment with Ubuntu + XP.

Making a dual boot of Ubuntu with Windows XP on your PC is ridiculously easy. I'll describe the steps for the installation and also what you should do post-install to give you a more usable system.

String Matching and Finite State Machines

String Searching is a subject of both theoretical and practical interest in computer science.  String-matching is a very important subject in the wider domain of text processing. String-matching algorithms are basic components used in implementations of practical software existing under most operating systems. Moreover, they emphasize programming methods that serve as paradigms in other fields of computer science (system or software design). Finally, they also play an important role in theoretical computer science by providing challenging problems and can be extended to areas such as

    *          pattern matching/recognition
    *          molecular biology, comparative genomics, …
    *          information retrieval
    *          data/text mining
    *          data/text compression, coding, encryption
    *          string processing in large databases

A string is a sequence of symbols drawn from some well defined set called the alphabet.

 Examples of alphabets include:

    *          ASCII code, Unicode
    *          binary alphabet {0,1}
    *          System of DNA base-pairs {A,C,G,T}
    *          Latin, Greek, Chinese alphabet

Examples of strings:

    *          Java/C/ADA programs, HTML/XML documents
    *          DNA sequences
    *          image/video/audio files

String-matching consists of finding an exact match, or more generally, a pattern in a text.  We shall examine this using Finite Automata as an implementation of regular expression matching.

Regular Expression Matching Algorithms

There are two fundamental different types of regex engines: one called DFA (Deterministic Finite Automata) and one called NFA (Non-Deterministic Finite Automata).  Before a discussion of the algorithms are made, we shall first cover some background information about Finite State Machines.
Mathematical Background

The theory of Finite Automata can be classified into several categories, but the one we need for the sake of regular expression recognition is the notion of determinism. Something is deterministic when it involves no chance - everything is known and can be prescribed and simulated beforehand. On the other hand, non-determinism is about chance and probabilities. It is commonly defined as "A property of a computation which may have more than one result".

An NFA N consists of:

   1.           A finite set I of input symbols
   2.           A finite set S of states
   3.           A next-state function f from S x I into P(S)
   4.           A subset Q of S of accepting states
   5.           An initial state s0 from S

Denoted as N(I, S, f, Q, s0)

A DFA is a special case of NFA in which:

   1.        There are no Epsilon transitions
   2.       All transitions leading away from a state are done on different input characters
   3.       All states must have a valid transition on an input character

 

Regex-Directed Versus Text-Directed

The NFA and DFA reflect fundamental differences in algorithms for applying a regular expression to a string.  The NFA is “regex-directed” and the DFA is “text-directed”.

During the course of an NFA match, the same character of the target might be checked by many different parts of the regex.  Even if a sub-expression can match, it might be applied again and again as it works in concert with the rest of the regex to find a match.  This is due to Backtracking mechanism of a NFA match.

The implications of NFA usage are that the details of how a match is made are very important.  The writer of a regex can exercise more control simply by changing how the regex is written.

A DFA engine behaves oppositely.  The engine keeps track of all matches simultaneously.  There could be multiple ways to achieve the same result, but since the DFA keeps track of all them simultaneously.  The DFA is essentially a “normalized” NFA.  It is due to this normalization that the DFA gains in efficiency over NFA with a slightly added level of complexity in implementation.

While a DFA may be more efficient, an NFA engine can support many things that a DFA cannot.  Among them are:

    *          Capturing text matched by a parenthesized sub-expression
    *          After-match information indicating where in the matched text each parenthesized sub-expression matched
    *          Look-around, and other complex zero-width assertions
    *          Possessive quantifiers and atomic grouping

 
Example Implementation of an NFA

Outside of the string matching world, NFA’s are used in theory and in practice for several concepts and physical machines that make use of this “abstract computer machine”.  Attached  is source code for you to examine.  Note that common regex functionality such as Kleene Closures, Unions, and Intersections have not been implemented; and construction of the NFA has to be done manually.

New features of JDK 5.0 - Generics

My favorite new feature of JDK 5.0 is Generics. Generics evolved from Templates in C++, but they are not entirely the same.  I will focus on the different applications of Generics and provide a brief explanation of how they are implemented using erasures.

 Introduction

Generics allow you to abstract over types.  The simplest example of demonstrating a Generic at work is with Collections.  The old way of getting an item out of a List looked something like this:

List l = new ArrayList();
l.add(someString);
l.add(anotherString);
for (Iterator listIterator = l.iterator(); listIterator.hasNext(); ) {
    String s = (String)listIterator.next();
}

This a trivial example of how we iterate through a list and how we have to cast the object we get out of the list.  Well, if we already know the type of object we are inserting into the list, then surely there must be a more precise way of extracting that object from your list.  This is part of the problem that Generics solve, here is what the same code looks like now with Generics (and the new for loop feature of JDK 5.0):

New features of JDK 5.0

The JDK 5.0 release has some new features and much improved enhancements to the JDK that developers should be happy and eager to take advantage of. I will attempt to showcase these improvements over the course of the next few weeks.

This week's new feature is Metadata (Annotations).
quote from Sun:

This language feature lets you avoid writing boilerplate code under many circumstances by enabling tools to generate it from annotations in the source code. This leads to a "declarative" programming style where the programmer says what should be done and tools emit the code to do it. Also it eliminates the need for maintaining "side files" that must be kept up to date with changes in source files. Instead the information can be maintained in the source file. Refer to JSR 175.