Showing posts with label Software Development. Show all posts

Tuesday, March 17, 2009

Apple iPhone OS 3.0 Announcement summary

Here are highlights of what was announced for the iPhone OS 3.0 release early this afternoon from Apple:
  • Cut and paste: it was worth the wait, the touch interaction to do this looks very cool (see picture at right). Works across applications and does undo.
  • Multimedia messaging: you can attach a picture to a text message
  • Ability to choose a group of photos and send them in a single email
  • Push email notification
  • Landscape mode text entry (so what)
  • Turn-by-turn GPS navigation.
  • Available in the summer. That's as detailed as it gets. No doubt will be linked to the new iPhone model coming out in July.
  • Virtually all of the new features will work with the original, pre-3G iPhone (exceptions: multimedia messaging and stereo bluetooth)
  • Peer-to-peer linkups between individual iPhones for games, file sharing, etc. This exists now with things like AirSharing and Holdem, but those companies probably rolled their own; now it's part of the API
  • API support for applications that connect to external devices. Demonstrations with medical devices were given (see pic at right). Medical applications of new technology are always a big win in corporate presentations; the real news here is that this will open up remoting of all sorts of sophisticated devices for music, video, information systems, anything you can imagine.
  • Ability to search in your emails on the server side, and search in your calendar items
  • Search your iPhone contents with Spotlight (well-known to Mac users)
  • The Sims 3 will run on the iPhone (see pic at right)

Friday, October 24, 2008

CloudCamp Chicago meeting Oct 21, 2008


Last Tuesday (Oct 21, 2008) was the Cloud Computing one-night conference "CloudCamp" hosted by Tech Cocktail. My company has some really time-consuming web analytics tasks that take days to run, and we're exploring using GridGain to distribute the work over several servers, so it was a good chance for me to get acquainted with this field.

I've included a few photos from the event, which come from the conference's Flickr photo gallery.

The meeting was alright. I didn't see anyone I knew there but had reasonably enjoyable small talk with mostly non-technical people during the drinking hour.

It wasn't a conference in any traditional sense of the word. There were no scheduled topics and no scheduled speakers. Instead, they used the format of O'Reilly's "Foo Camps". They have a grid of sessions and meeting rooms on a whiteboard. All the squares are empty. Then they ask everybody who has a topic they are interested in to write it in a square on the board. Presto...you are now the moderator of that session.

I volunteered the topic "Software Engineering and Grid Computing". Eight really smart people showed up, including two physicists from Italy, two doctoral students, a guy from UBS, and a consultant from CohesiveFT, a Chicago company specializing in cloud computing.

Physics people have been doing grid computing for years, so they were levels above me. But interestingly, a lot of their problems have to do with resource sharing. There can be other research teams that also want to use the grid, and maybe they don't want the nodes installed with the same software you do, and the people with the biggest grants tend to win out.

The most interesting guy there was the consultant from CohesiveFT, Pat Kerpan. He had two pieces of memorable wisdom. (1) Rule of thumb: count on a 30% performance penalty imposed from the overhead of grid enabling your problem. (2) It's easier to bring the computing to the data than to bring the data to the computing.

He talked about the stuff their company uses for their clients called "Open Source Sun Hypervisor". This has an interface that allows you to trick out your nodes with whatever setup you want (e.g. pick and choose between java, tomcat, flavors of linux, struts, etc.) and get a multinode environment all set up in six minutes.

Several of the people spoke knowingly of "paravirtualization". Pat distinguished between problems that are "compute bound" vs. "data bound".

A few people referred to Hadoop. No one had ever heard of GridGain, but I don't think that Java development was strongly represented in that collection of people.

People have different aims in cloud computing. For a lot of people, they don't mind if a lot of virtual nodes are spread over one machine.

Virtualization was recommended as a convention even when you are doing one node per machine.

In many commercial applications, 4 virtual nodes per machine is typical.

Many people responded to my description of what we are trying to do at iCrossing with "why don't you just use Amazon's cloud computing"? To hear them describe it, Amazon gives you the flexibility to do whatever you want.

I could have attended some of the other sessions if I wanted to stay two more hours, but I split after mine. The other sessions were on pretty soft- or business-focussed topics. One guy led a session called “What color is your cloud?” There were two Microsoft people who found each other and made their own Microsoft-focussed session ("Cloud Computing in Windows 8 and SQL Server").

Wednesday, September 24, 2008

Job hunter tips: questions to ask employers

If you're a web software developer and in the market for a job right now, this list of questions to ask your potential employer could be helpful:

http://docs.google.com/Doc?id=dcntd65k_129dshbbncs

Now it's not a perfect list, since it's obviously skewed towards an open-source/Java enterprise web developer. But if you find it useful, here it is! Good luck.

Friday, October 19, 2007

The SHARC Timbre Dataset v. 2.0: XML Format

SHARC is a dataset of musical timbre information that I collected by analyzing over 1300 orchestral musical instrument notes. Specifically, the information is amplitude and phase data from a selected steady-state portion of each note. The dataset is now available in XML format.

Some time ago, when I was a grad student, and while holding various fellowships after I got my PhD, I did research in music, human hearing and digital audio (see my publications). One of the projects I undertook was to compile a collection of information on musical instrument tones, which I called SHARC ("Sandell Harmonic Archive").

I've described SHARC in a few places before: in an article from 1991, and in the release notes from the original distribution. Briefly, though, what I did was this. I had a collection of CDs consisting of individually performed notes of all the standard instruments of the orchestra, one recording for each note in the respective instrument's playable range. For each note, I chose a middle portion of the recording, during the note's steady state, and performed a spectrum analysis. I saved the amplitudes and phases of all the harmonics of the pitch's fundamental frequency up to a ceiling of 10kHz.

In my first version of the distribution (which you can still download in compressed tar format), SHARC consisted of a series of files, one for each note that was analyzed, organized into directories by instrument. That was 1994; since then, XML has come into being and I've now released SHARC in an XML format.

I'm calling this SHARC's "2.0" release, and back-versioning the original distribution to "1.0" (even though I timidly referred to it as version 0.921 at the time). In this blog article, I'll describe the design of this 2.0 version, for the convenience of anyone who would like to work with it.

Let's consider the XML that specifies a single instrument and all of its notes, and their harmonics. The rough outline of the XML is:



<instrument>
<note> <!-- first note -->
<a/> <!-- harmonic 1 -->
<a/> <!-- harmonic 2 -->
...etc...
</note>
<note> <!-- second note -->
<a/>
<a/>
...etc...
</note>
...etc...
</instrument>



The <instrument> element has the following attributes:
  • id: the instrument's short name, containing no spaces, suitable for variable names and querystring parameters
  • name: the instrument's longer, more descriptive name
  • source: the cd from which the tone originated
  • cd: the volume of cd
  • track: the track on the cd
  • numNotes: the number of notes for this instrument


Here is a sample <instrument> element:


<instrument
id="CB_pizz" name="Contrabass (pizzicato)"
source="McGill" cd="1" track="18"
numNotes="41">


The <note> element has the following attributes:
  • pitch: the note's pitch and octave number, e.g. c4 = middle C. Sharps are specified with the letter s, e.g. 'fs4' rather than 'f#4'.
  • seq: the sequential order number of the note in the series (i.e. starting at 1 with the first note)
  • keyNum: numerical location of the pitch on a piano keyboard, where middle C = 48
  • fundHz: the frequency of the note's fundamental (e.g. a4 = 440)
  • numHarms: the number of harmonics (i.e. the number of <a> elements to follow)



Here is a sample note element:

<note pitch="cs1" seq="2" keyNum="13"
fundHz="34.648" numHarms="287">
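The relationship between the pitch, keyNum and fundHz attributes can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and it assumes the fundHz values follow equal temperament with a4 = 440 Hz, which matches the sample values shown here (cs1 = 34.648) but is not something the dataset itself promises.

```python
import re

# Semitone offsets within an octave, using the dataset's "s" spelling for sharps.
SEMITONES = {"c": 0, "cs": 1, "d": 2, "ds": 3, "e": 4, "f": 5,
             "fs": 6, "g": 7, "gs": 8, "a": 9, "as": 10, "b": 11}


def key_num(pitch: str) -> int:
    """Convert a pitch name such as 'cs1' to its keyNum (middle C, c4, is 48)."""
    match = re.fullmatch(r"([a-g]s?)(\d+)", pitch)
    if match is None or match.group(1) not in SEMITONES:
        raise ValueError(f"unrecognized pitch name: {pitch!r}")
    name, octave = match.group(1), int(match.group(2))
    return 12 * octave + SEMITONES[name]


def equal_tempered_hz(key: int, a4_hz: float = 440.0) -> float:
    """Assumed tuning: equal temperament, with a4 (keyNum 57) at a4_hz."""
    return a4_hz * 2 ** ((key - 57) / 12)


print(key_num("cs1"), round(equal_tempered_hz(key_num("cs1")), 3))
```

For the sample note above, this reproduces keyNum 13 and a fundamental within a fraction of a millihertz of the stored 34.648.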


Finally we have the harmonic data itself, contained in the <a> element. The harmonic amplitude value is the text node of the element, expressed as a linear value (i.e. not in dB). The attributes for the <a> element are:
  • n: the sequential order number of the harmonic in the series (i.e. starting at 1 with the first harmonic)
  • p: phase, expressed in the range between negative and positive pi


Here is a sample sequence of a few <a> elements:

<a n="1" p="-1.686">32.91</a>
<a n="2" p="0.309">2131.69</a>
<a n="3" p="1.764">5878.0</a>


Using the brief names 'n' and 'p' keeps the size of the XML document lower. For similar reasons, the frequency of each harmonic is not given. To obtain the frequency of a harmonic, you simply multiply the value of its 'n' attribute by the value of the 'fundHz' attribute of the enclosing 'note' element.
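Here is a minimal Python sketch of that multiplication, using the standard library's ElementTree. The note below is a made-up combination for illustration: it pairs the attributes of the sample cs1 note with the three sample <a> elements and trims numHarms to 3, so the pairing is an assumption rather than real SHARC data. The function name is also mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed note: real notes carry many more <a> elements.
NOTE_XML = """
<note pitch="cs1" seq="2" keyNum="13" fundHz="34.648" numHarms="3">
  <a n="1" p="-1.686">32.91</a>
  <a n="2" p="0.309">2131.69</a>
  <a n="3" p="1.764">5878.0</a>
</note>
"""


def harmonic_table(note):
    """Return one dict per harmonic, with its frequency derived as n * fundHz."""
    fund_hz = float(note.get("fundHz"))
    table = []
    for a in note.findall("a"):
        n = int(a.get("n"))
        table.append({
            "n": n,
            "freq_hz": n * fund_hz,       # frequency is not stored, so derive it
            "amplitude": float(a.text),   # linear amplitude, not dB
            "phase": float(a.get("p")),   # radians, between -pi and pi
        })
    return table


note = ET.fromstring(NOTE_XML.strip())
table = harmonic_table(note)
for row in table:
    print(row["n"], round(row["freq_hz"], 3), row["amplitude"])
```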

As I said, that is a rough sketch of the XML; to simplify the explanation I left out some of the detail. In addition to what I have discussed so far, each instrument element, and each note element, has a child element <ranges> which contains useful metadata. Here is a sample <ranges> element for an <instrument> element:


<ranges>
<lowest>
<harmonicFreq harmNum="1" keyNum="12"
pitch="c1">32.7
</harmonicFreq>
<pitch
fundHz="32.7" keyNum="12">c1</pitch>
<amplitude freqHz="8449.15" keyNum="22"
pitch="as1" fundHz="58.27"
harmNum="145">0.0</amplitude>
</lowest>
<highest>
<pitch fundHz="349.22"
keyNum="53">f4</pitch>
<harmonicFreq harmNum="151" keyNum="25"
pitch="cs2">10463.69</harmonicFreq>
<amplitude freqHz="261.62" keyNum="48"
harmNum="1" pitch="c4"
fundHz="261.626">15389.0</amplitude>
</highest>
<pitches>c1 cs1 d1 ds1 e1 f1 fs1 gs1 a1 as1
b1 c2 cs2 d2 ds2 e2 f2 fs2 g2 gs2 a2 as2
b2 c3 cs3 d3 ds3 e3 f3 fs3 g3 gs3 a3 as3
b3 c4 cs4 d4 ds4 e4 f4
</pitches>
</ranges>


The logic behind the <ranges> element is mostly convenience for applications that will be constructing graphic plots from the data. For example, having the highest and lowest frequency specified here, rather than making it necessary to traverse through the data to find it, makes it easier for a program to set up the minimum and maximum for a graphic plot. The <pitches> element is another convenience that keeps the user from having to issue a thorny xpath query just to get a list of all the instrument's pitches.

Let's drill down into the details of this <ranges> element. The text node of the ranges/lowest/harmonicFreq element is the lowest frequency of any harmonic in the entire instrument's collection. Obviously, this is always harmonic 1 of the instrument's lowest note. The attributes for harmonicFreq convey this, as well as the pitch (c1) and keyNum (12). The element ranges/lowest/pitch contains the same information, but described in terms of the lowest pitch and its fundamental frequency. This redundancy has little impact since it occurs just once for the instrument. Information about the lowest-amplitude harmonic to be found in the instrument is given in the ranges/lowest/amplitude element. For the instrument in question, this honor goes to the 145th harmonic (frequency of 8449.15 Hz) of a#1 (keyNum of 22, fundamental frequency of 58.27 Hz), the note 10 semitones above the instrument's lowest note.

The ranges/highest element provides equivalent data for the highest harmonic frequency, highest pitch and highest amplitude.
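As a rough illustration of the convenience, here is how a plotting program might pull its axis limits and pitch list straight from an instrument's <ranges> element, without walking through any notes. This is a Python sketch using ElementTree: the function name and the returned dictionary are my own invention, and the XML is the sample above with the pitch list shortened to three entries.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample <ranges>; the pitch list is cut to three entries.
INSTRUMENT_XML = """
<instrument id="CB_pizz" name="Contrabass (pizzicato)">
  <ranges>
    <lowest>
      <harmonicFreq harmNum="1" keyNum="12" pitch="c1">32.7
      </harmonicFreq>
      <amplitude freqHz="8449.15" keyNum="22" pitch="as1"
                 fundHz="58.27" harmNum="145">0.0</amplitude>
    </lowest>
    <highest>
      <harmonicFreq harmNum="151" keyNum="25" pitch="cs2">10463.69</harmonicFreq>
      <amplitude freqHz="261.62" keyNum="48" harmNum="1"
                 pitch="c4" fundHz="261.626">15389.0</amplitude>
    </highest>
    <pitches>c1 cs1 d1</pitches>
  </ranges>
</instrument>
"""


def plot_limits(instrument):
    """Read axis limits and the pitch list from the instrument's own <ranges>."""
    ranges = instrument.find("ranges")   # direct child, not the per-note ranges

    def value(path):
        # float() tolerates the whitespace that surrounds some text nodes
        return float(ranges.findtext(path))

    return {
        "freq_hz": (value("lowest/harmonicFreq"), value("highest/harmonicFreq")),
        "amplitude": (value("lowest/amplitude"), value("highest/amplitude")),
        "pitches": ranges.findtext("pitches").split(),
    }


limits = plot_limits(ET.fromstring(INSTRUMENT_XML.strip()))
print(limits)
```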

Here is a sample <ranges> element for a <note> element:

<ranges>
<lowest>
<amplitude freqHz="6475.19"
harmNum="198">0.0</amplitude>
<harmonicFreq
harmNum="1">32.7</harmonicFreq>
</lowest>
<highest>
<amplitude freqHz="98.1"
harmNum="3">2335.0</amplitude>
<harmonicFreq
harmNum="303">9909.0</harmonicFreq>
</highest>
</ranges>


This element provides data similar to instrument/ranges, but in terms of the highest/lowest frequency and amplitude harmonics for the note in question.

The XML was designed in a way that the entire SHARC dataset could be combined into a single XML file (i.e. as a series of instrument elements), and this file is in fact available for download in zip format. However, this file is quite large (nearly 3 meg), which will put quite a burden on parsers, and especially DOM parsers. For more efficient processing, I have placed each instrument into its own dataset file.
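If you do want to work with the combined file, a streaming parser avoids building the whole tree in memory. The Python sketch below uses ElementTree's iterparse and discards each instrument once it has been processed. The wrapper element name <sharc>, the second instrument id and the tiny inline sample are all assumptions for the sake of a runnable example; check the real file for its actual root element.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for the combined file. The <sharc> root and the second
# instrument id are hypothetical.
SAMPLE = b"""<sharc>
<instrument id="CB_pizz">
  <note pitch="c1"><a n="1" p="0.1">1.0</a></note>
  <note pitch="cs1"><a n="1" p="0.2">2.0</a></note>
</instrument>
<instrument id="example_b">
  <note pitch="g3"><a n="1" p="0.3">3.0</a></note>
</instrument>
</sharc>"""


def notes_per_instrument(source):
    """Count <note> elements per instrument while streaming through the file."""
    counts = {}
    # "end" events fire when an element's closing tag has been read,
    # so the instrument's notes are complete at that point.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "instrument":
            counts[elem.get("id")] = len(elem.findall("note"))
            elem.clear()   # drop this instrument's children before moving on
    return counts


counts = notes_per_instrument(io.BytesIO(SAMPLE))
print(counts)
```

In real use you would pass the path of the downloaded file instead of the BytesIO object.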

For a summary, here is a shorthand showing the overall design of the XML, with attributes shown in parentheses and text nodes in square brackets:



instrument (id, name, source, cd, track, numNotes)
    ranges
        lowest
            harmonicFreq (harmNum, keyNum, pitch) [frequency]
            pitch (fundHz, keyNum) [pitch]
            amplitude (freqHz, keyNum, pitch, fundHz, harmNum) [amplitude]
        highest
            (all same as lowest)
    note (pitch, seq, keyNum, fundHz, numHarms)
        ranges
            lowest
                amplitude (freqHz, harmNum) [amplitude]
                harmonicFreq (harmNum) [frequency]
            highest
                (all same as lowest)
        a (n, p) [amplitude]


I'm not attached to this particular XML design, and I may come out with a 3.0 version some day. One change I expect to make in a future version is to move a lot of information that is in attributes to elements, which means that more queries would return element nodes that could be further processed. Another idea I have is to make a secondary, "bare bones" release, that would have no metadata, for quicker processing.

Enjoy playing with the data!

Wednesday, August 15, 2007

IDEA with Tomcat 6 Integration

If you like running Tomcat from within IDEA and you want to be on Tomcat version 6, you need to stick with Tomcat 6.0.10 for a while. Any later version causes this complaint to come up when you launch Tomcat: "Error running Tomcat6: Cannot find configuration of jsp built-in servlet in C:\Users\greg\.IntelliJIdea60\system\tomcat_Unnamed_7dbqbe5b1\web.xml". I noticed just the other day that Tomcat 6.0.14 came out and I confirmed that this version has the problem too.

Friday, July 27, 2007

Vista Guinea Pig

I just bought myself a Lenovo desktop machine for my home office, and it came with Vista Business. This is the first time I've submitted myself to being a guinea pig for a new, pre-service-pack OS. Here are a few reactions, gripes and maybe even some left-handed praises.


It started out of the box okay, after answering all the usual first-time-start questions of name, timezone, etc. Early on, I started transferring files from one of my USB drives to the new disk, and I was appalled at how slowly it was going. Despite the new machine having a 7200 rpm disk drive, the file transfer seemed to take about four times longer than it would have on my Toshiba laptop, which has a 5400 rpm drive. Eventually I figured out that indexing was turned on for optimizing search and the disk was churning constantly. Since turning it off, file transfers are much more reasonable, although I have yet to try a side-by-side comparison. You can find instructions on disabling indexing on the web.

In the course of loading up my customary developer software, I had to use the Explorer a lot, set environment variables, etc. (Note that whenever I say 'Explorer' I always mean the file browsing app 'Windows Explorer', not the web browser 'Internet Explorer.') The customary alienation that one gets trying to do routine things in a new OS's GUI was running pretty high for me. Like every MS-Windows incarnation before it, if you don't want to blindly follow Microsoft's vision of where your files should be (i.e. "My Documents"), you have to work a lot harder. After a few evenings' work, I know how to get around, in the course of which I learned two disappointments about Vista.

Disappointment 1: Vista is just a big shiny wrapper around MS Windows XP. Once you've dug deep enough, you find that the Explorer does little more than it did before, and all the Control Panel applets offer all the same functionality as before.

Disappointment 2: I'm guessing that the motivation for the Shiny Wrapper came out of a need to "keep up with the Jobses" :-) and give Windows a glassy, 3-d look like the Mac. But the imitation is so shallow and naive. I get the impression that it was designed by people who don't actually "get" the Mac. It's like they made decisions like "the Mac uses shiny red buttons in the lower corners, let's do that and then they'll like us too"...but the end result is an incoherent mess. With the clever GUIs that Apple makes for iPods, Macs, iPhones and the like, you immerse, understand and say Wow. The Vista folks wanted Wow, but all they're going to get is, "Sigh. Why?"

Okay, having gotten that gripe out of the way, I've noticed a few good things. I'm having no trouble loading open source and developer software on the machine. I've got Tomcat 6 with JDK 1.5 running. Ant, Vim, Cygwin, Gimp and Intellij IDEA are fine. I installed all of Office 2003 and so far Word, Excel and Outlook run correctly. But I've had some problems too. My cheap-o Visioneer scanner won't load. A favorite convenience app of mine, Shortcuts Map, will load and run, but I can't close the app without using the task manager.

My user 'home' directory is now c:\Users\greg instead of the old, space-character plagued c:\Documents and Settings\greg. As far as names go, I can see actually using that as my 'home' directory, except that it is filled with the usual junk that is unrelated to what I actually use my computer for: 'My Documents', 'My Music', etc. And not surprisingly, Microsoft still presents it in Explorer as though it's a special entity, like Desktop and My Computer, and not just an ordinary folder, which it is.

Another good thing is that Explorer is now remembering recently used locations. It makes it much faster to get to your stuff that way. Nice to know that Microsoft finally found a way to do something the Mac has been doing for 20 years already.

Back to what I wrote about at the top, the indexing that slowed down the hard drive by a factor of four...I guess Microsoft, showing its usual insecurity over competitors' innovations, figured they needed to make Vista like Google, i.e. searchable. And they bet the farm on it to the point that they hoped that users wouldn't mind spending the first 7 hours of their Vista experience with a disk drive constantly churning and taking away productivity. Can they really be so clueless? Indexing, whether for a 160 gigabyte drive, or a giant corporate website, should be done in the early morning hours, when no one is at work, or at least on dedicated machines. Oh, they could have included some instructions to this effect: "After you finish using your new PC for the day, we suggest that you run Index Manager (tm) and leave your machine on overnight. The next time you use your machine you will find that you can search the entire computer quickly and easily." But I don't think that fits in with Microsoft's estimation of their user base's intelligence.

The conventional wisdom I've read on the net about Vista, and with which I now agree, is: don't be a guinea pig, stick with XP until Vista's first service pack comes out. But if you're buying a new machine, and Vista is forced upon you, and you can afford a few days to re-tool, Vista is fine. You'll just be that much more on top of things when the first service pack comes out and you'll be wanting to switch...because presumably Vista has a bunch of features that we'll be wanting. As I discover what they are, I'll write another blog entry about it.

Friday, July 13, 2007

Maven2 Introduction part 1: the Coordinate System

Maven is gaining traction as the premier way of organizing Java code and managing builds. The entire world won't convert overnight, but adoption is likely to steadily increase once more people get over the learning curve and the conceptual difference from ant scripts, the previous prevailing model of build management. In this article I'm going to try to take a bit of the steepness off of the learning curve for you.

The big sell for Maven is the dependency management and the coherence it brings to both your individual projects, and your code development overall. And by dependency management, I mostly mean where and which jar files you use. For ant users reading this, this is way more than what the "depends" attribute of an ant target gets you. It's a way of:

  • Avoiding confusion between different versions of the same jars

  • Maintaining only one copy of the same jar on your computer (instead of having one copy for each of your projects that use it)

  • Having a mechanism that retrieves jars (and making sure it is the right version as well) from the internet for you without you having to think about where it should be stored



Maven does this by having a highly structured approach to dependencies, and importantly, through the adherence to this framework by the community that uses Maven. In this part of the article I'll start by covering the cornerstone of the Maven approach, the "Coordinate System," then we'll move on to Maven repositories.

The Coordinate System: groupId, artifactId and version


At the core of Maven 2 is its method of identifying resources (mostly jar files) by a strictly followed practice of file and directory naming. The goal is similar to that of XML namespaces and the Java packaging conventions: to define items as distinct points in space according to a universally followed set of conventions. It's simply these four identifiers:


  • groupId: usually a reversed domain name such as com.lowagie.

  • artifactId: a common name for the resource, such as itext.

  • version: a version indicator such as 1.4. Numbers and decimals are typical, but not required, values for the version.

  • packaging: the type of end product which could be ear or war, but is most often jar (and therefore the default, so packaging need not be specified)



(A side note about the groupId: there are many jars out there that do not use their organization's reverse domain name. In fact, they comprise some of the most widely used jars out there: log4j, jdom, ant, and xalan to name a few. All they use for their groupId is their simple well-known name (log4j, jdom, ant, and xalan for the examples just mentioned), and their artifactId is the same. These famous jars just happen to have been on the scene during an earlier version of Maven before the convention of using reversed domain names took hold; they held on to their old coordinate locations instead of updating.)


  1. Here is the location of a jar file named itext-1.4.jar in a proper Maven repository. The initial part is chosen by the individual user (c:/.m2/repository) but everything following that is dictated by the coordinate system:

    c:/.m2/repository/com/lowagie/itext/1.4/itext-1.4.jar

  2. In the same directory as itext-1.4.jar, you will find the file itext-1.4.pom. This is an XML file containing the following:


    <project >

    <modelVersion>4.0.0</modelVersion>

    <groupId>com.lowagie</groupId>

    <artifactId>itext</artifactId>

    <packaging>jar</packaging>

    <version>1.4</version>

    </project>




  3. In the root directory of a project that uses the jar, you will find a file called pom.xml, and that file will contain the same coordinates, but wrapped inside a <dependency> element. (A <dependency> carries no <modelVersion>, and the packaging is expressed as <type>, which defaults to jar and can be omitted.)


    <dependency >

    <groupId>com.lowagie</groupId>

    <artifactId>itext</artifactId>

    <type>jar</type>

    <version>1.4</version>

    </dependency>




  4. Retrieval of resources over the internet is integral to Maven. A jar can be referred to by its URL on a known repository. The first part of the URL is specific to the repository, whereas the rest follows the file structure of the coordinate system. Here is a URL for the location of a jar at the well-known repository ibiblio:
    http://mirrors.ibiblio.org/pub/mirrors/maven2/com/lowagie/itext/1.4/itext-1.4.jar

  5. Now here is a Maven command line statement. Its purpose is to install a jar in a repository, but don't worry about that right now; just notice how the coordinate system manifests itself on a typical command line statement.


    mvn install:install-file -DgroupId=com.lowagie \

    -DartifactId=itext \

    -Dversion=1.4 \

    -Dpackaging=jar \

    -Dfile=itext-1.4.jar


  6. Occasionally a point in space is referenced with a single line of text, using the format

    groupId:artifactId:packaging:version, as in:

    com.lowagie:itext:jar:1.4


These six situations show you most of the ways in which jars are referenced in the Maven world. There is a maddening consistency and pervasiveness to the Coordinate System throughout Maven. The more you learn about Maven, the more you discover you've already learned it.
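To make the mapping concrete, here is a small Python sketch that parses the single-line form from item 6 and derives the repository-relative path seen in items 1 and 4. The class and function names are my own, not anything Maven provides, and the sketch assumes the file extension is the same as the packaging, which holds for jar but not for every packaging type.

```python
from typing import NamedTuple


class Coordinate(NamedTuple):
    group_id: str
    artifact_id: str
    packaging: str
    version: str


def parse_coordinate(text: str) -> Coordinate:
    """Parse the single-line form groupId:artifactId:packaging:version."""
    parts = text.split(":")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected groupId:artifactId:packaging:version, got {text!r}")
    return Coordinate(*parts)


def repository_path(coord: Coordinate) -> str:
    """Path below the repository root (local directory or remote URL)."""
    group_dirs = coord.group_id.replace(".", "/")   # com.lowagie -> com/lowagie
    # Assumption: the file extension equals the packaging (true for jar).
    filename = f"{coord.artifact_id}-{coord.version}.{coord.packaging}"
    return f"{group_dirs}/{coord.artifact_id}/{coord.version}/{filename}"


coord = parse_coordinate("com.lowagie:itext:jar:1.4")
print("c:/.m2/repository/" + repository_path(coord))
```

The same relative path, appended to a remote repository's base URL, gives the download location.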

Sunday, October 22, 2006

S-Corps for Software Contractors

Do you prefer 1099 or corp-corp?


Ever been asked that by a recruiter or HR? Did you freeze like a deer in the headlights?

If so, it's probably because you've spent your entire career up until now in Full-Time, 'Permanent' positions, and you're about to do your first real independent contract. In many situations going 1099 is the least costly and least complicated route to take. But, if a client ever insists that you do "corp-to-corp" (as happened to me last year), you'll need to incorporate and take on a few added responsibilities and expenses.

If you ever find yourself needing to do corp-to-corp, this guide will give you a real leg-up on the process. Before I start though, I need to lay out some disclaimers, which I assume you will keep in mind at every point as you read this article:


  1. I am neither a lawyer nor accountant. I am not giving professional advice, only sharing information from my personal experience. I provide no assurance that it is accurate or complete. Before you replicate anything from my experience you should get input from a qualified lawyer and a qualified accountant and consider their advice primary. I am not liable for any legal or financial woes you incur from following or failing to follow this.

  2. My company is in Illinois and is incorporated in that state. I did no research to find how tax & corporate law works in other states. Your accountant and lawyer will have to advise you on the peculiarities of your state.




That being said, let's get started.

The Gory Truth, All At Once


Here is an executive summary of incorporating and running a corporation, at lightspeed. The usual choice for an individual is the S-Corp. The idea is you are setting up a corporation in which you are the sole employee. The legal process of drawing up Articles of Incorporation must be undertaken; your lawyer can do it, or you can do it yourself on the web. You're going to put yourself on payroll. You're going to do all the deductions, and file and pay them yourself to the Fed and your State. Some of these deductions have to be matched by the company, which means you pay them twice. You also pay the state and the fed unemployment insurance. The filing and payment of all these various taxes happen on different timescales: some are monthly, some are quarterly, and one of them is yearly. Plus, big corporate clients will require you to take out insurance as well.

Why Incorporate


You heard right: double the usual Medicare & Social Security, state and federal unemployment, fees for filing articles of incorporation, and business insurance. So why take this on, then? Because an employer may require corp-to-corp: it's how they keep their nose clean with the IRS. The IRS comes down pretty hard on tax-evading companies who try to hide part-time employees by calling them "contractors". Companies know that this all costs you extra money, so your hourly rate is adjusted appropriately.

Or you may want to be an S-corp because you expect your business to grow (i.e. hire employees), or perhaps the legal formality of the corp. status raises your standing in the profession. S-corp is not the only choice; an LLC (Limited Liability Company) is also a possibility that meets the corp-to-corp requirements. LLCs are usually for when you have one or more partners in addition to yourself, so if incorporation is really just for you to run a company of one, S-corp is the obvious choice.

Articles of Incorporation


It is not strictly necessary to engage a lawyer. You can actually incorporate online with sites like www.BizFilings.com. I think there are several good reasons, however, for choosing a lawyer. Your lawyer will have the best advice for your state, and he/she will be available to answer your questions. A website that serves a national clientele can't answer questions about your state. Unless you are comfortable with legalese, plowing your way through all the documents will be a major distraction. The cost of a lawyer and the online costs are pretty much the same. It cost me $600 to incorporate.

The process itself is not a big deal, really. With a lawyer you'll take care of the whole thing with one phone call, a few days' wait, and one in-person meeting. The lawyer asks you a few questions, you sign a few forms, you get a stack of papers, and your 9-digit FEIN (Federal Employer Identification Number).

Your FEIN becomes the key to everything. It shows your client that you're incorporated; it sets you up for paying taxes; it allows you to create an account with a bank in your company's name. Pay close attention to that last one: if you bill your client for two weeks, and he writes the check to XYZ Corp, the bank ain't gonna let you cash it on your personal account. Get your FEIN and your company bank account in order before you start billing.

Creating Your Payroll


Running a payroll is more records-keeping than anything else. You do not need to buy some Quicken product or engage a professional to handle payroll; you want to roll up your sleeves, handle this yourself, and know exactly what is going on with your money. First make a template for a pay stub. Keep it simple, just copy a paystub from one of your recent W2 employers, including deductions for Medicare, Social Security, Federal taxes, and State taxes. Distinguish between gross and net pay. You can do all those Year-to-Date columns on the gross pay and deductions too.

Pick a frequency of pay. What makes sense here? First of all keep in mind that your invoicing (billing) to your client is completely decoupled from your salary, and there's no need or benefit to keeping them in lock step. I chose bi-weekly just so I could recreate the amount of cashflow I had at my last W2 position. Whether you do bi-weekly or something else, it would be wise to stick to it and pay yourself on every pay period; otherwise, having payroll tax filings that change wildly from month-to-month doesn't look like you're really running a business.

Pick your salary. All your taxes combined are going to run about 36% of your gross, so for sure don't pay yourself more than that. But you should still probably pay yourself less than the remaining 64% because you'll have other business expenses (which I'll cover later). One self-employed computer contractor I know said to me "I pay myself like a secretary." Why not pay yourself cheap? The money that stays in the company account is still yours anyway. You can give yourself a quarterly bonus to make up for shortfalls. (And yes, all the taxes apply to bonuses as well.) The key advice is, be conservative on this from the start, and you won't find yourself in a panic later.

Your employee taxes


Here are the figures for calculating the various taxes that I used in 2006 (in my state of Illinois) for each pay stub. Social Security is 6.2% of gross; Medicare 1.45% of gross; Federal tax is (25% of gross) - $336.55; State tax is (3% of (gross - $1000)) + $30. Where do these figures come from?


  • Federal tax figures: The IRS website, www.irs.gov, gives you the "Tables for Percentage Method of Withholding" (see links below); apply them to your frequency of pay. These tables cover circumstances such as being single or married and number of exemptions.

  • State tax figures: The calculation process is similar, and possibly simpler. In Illinois, once your gross for the pay period is over $1000 you follow one simple formula. See the links at the end for the location of Illinois tables.

  • SS & Medicare: These figures rarely change; the two rates of 6.2% and 1.45% have been constant since at least 1997.
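
If you'd rather see the per-paystub arithmetic as code than as spreadsheet formulas, here is a minimal Python sketch. It simply hard-codes the 2006 Illinois figures quoted above; the function name `withholdings` and the $3000 example gross are my own illustration, and the federal and state formulas only fit the bracket and filing status I was in, so check the current tables before borrowing them.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def withholdings(gross):
    """Per-paycheck withholdings using the 2006 Illinois figures from this post.

    Assumption: gross pay for the period is over $1000, since the state
    formula quoted above is only given for that case.
    """
    gross = Decimal(str(gross))
    if gross <= Decimal("1000"):
        raise ValueError("these formulas assume gross pay over $1000 per period")
    taxes = {
        "social_security": gross * Decimal("0.062"),
        "medicare": gross * Decimal("0.0145"),
        "federal": gross * Decimal("0.25") - Decimal("336.55"),
        "state": (gross - Decimal("1000")) * Decimal("0.03") + Decimal("30"),
    }
    # Round each deduction to the cent, as it would appear on the stub.
    taxes = {name: amt.quantize(CENT, rounding=ROUND_HALF_UP)
             for name, amt in taxes.items()}
    taxes["net"] = gross - sum(taxes.values())
    return taxes

stub = withholdings(3000)  # hypothetical bi-weekly gross
for name, amount in stub.items():
    print(f"{name:16} {amount}")
```

Decimal is used instead of float so the cents on the stub add up exactly.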



Those "other" taxes


Now as a business you have to duplicate those Medicare and Social Security taxes for each employee (i.e. you) and pay them AGAIN as the "company contribution". Your W2 employers have been doing this all along for you, and now you're doing it for your own company.

Next there is unemployment insurance. I don't mean a policy that you take out from State Farm; I mean a mandatory tax you pay to the government. In case you didn't know this, your most recent employer is the one footing the bill if you go on unemployment. If you work for ABC Corp for a year, then lose your job and go on unemployment and get a few hundred dollars in benefits from the state every two weeks, that is ABC Corp who is paying for the lion's share of that through their unemployment insurance contributions. And your corporation will have to make those contributions now. Just think of it, taking money out of your own gross pay to pay yourself if you go on unemployment! (Seriously though, you might want to talk to your local Department of Employment Security before you try that one.)

There are both State and Federal unemployment taxes to pay, but the Federal one is negligible...a fixed amount in the area of $60 for the whole year. In Illinois, the rate is 4.2% and figured quarterly, on a maximum of $11,000. So let's say you gross $8000 one quarter, your State unemployment tax is 4.2% of $8000. If you gross over $11,000, say $12,500, you only pay 4.2% on $11,000.
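
The employer-side arithmetic is short enough to sketch in a few lines of Python. The rates and the $11,000 cap are the Illinois figures quoted above, applied the way I describe them here; the function names are made up for illustration, and your state's rate and cap will almost certainly differ.

```python
from decimal import Decimal

CENT = Decimal("0.01")
STATE_UI_RATE = Decimal("0.042")       # Illinois rate quoted in this post
STATE_UI_WAGE_CAP = Decimal("11000")   # maximum gross the rate is applied to

def state_unemployment_tax(quarter_gross):
    """State unemployment tax on a quarter's gross, capped as described above."""
    taxable = min(Decimal(str(quarter_gross)), STATE_UI_WAGE_CAP)
    return (taxable * STATE_UI_RATE).quantize(CENT)

def employer_match(gross):
    """Company contribution: a second copy of Social Security plus Medicare."""
    rate = Decimal("0.062") + Decimal("0.0145")
    return (Decimal(str(gross)) * rate).quantize(CENT)

print(state_unemployment_tax(8000))    # the under-the-cap example
print(state_unemployment_tax(12500))   # the over-the-cap example
```

For the two examples in the text this works out to $336.00 and $462.00 respectively.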

So What Should My Hourly Rate Be?


Very good question! Perhaps learning about all these extra costs has made you run away screaming from the idea of doing corp-to-corp. Don't. These are all known expenses of running a business. Work them into your hourly. If a company has you doing corp-to-corp then you've let them off the hook for Medicare, Social Security and unemployment insurance, and they know full well that you're shouldering it in their place. Plus that company is not paying medical insurance or retirement benefits costs for you either. So don't be shy about upping your hourly. But how much is reasonable?

Perhaps you know your market value in terms of a yearly salary in a normal W2 job. What is the reasonable mathematical equivalent for a corp-to-corp? That will be useful in establishing a baseline. Perhaps there are reasons for a contract to pay you much more than that; maybe it is a short contract with huge demands and a high penalty for failure, and located 1000 miles away. But that's for you to figure out. For now, we'll just talk about an hourly that leaves you just as well off as if you had a W2 at your current market value.

Imagine a hypothetical worker named Sue earning $80k gross on a W2. (We're going to leave medical and retirement benefits out of the picture for now, but more on that later.) That works to about $38.47/hour. Sue's net is coming to about $59.5k/year because her Medicare, SS, State & Federal taxes come to about $20.5k/year. Now, let's transform that into a corp-to-corp position. The company portion of Medicare & SS, and the state and federal unemployment taxes will come to an additional $8.9k/year and Sue's net annual salary has gone down to $38.6k/year. So her $38.47/hour puts Sue way below what she was earning on a W2. Boost her hourly up to $53.45, though, and then Sue hits that net target of $59.5k/year.

But now Sue wants medical coverage, which, say, costs her $1000 a month. At her last W2 she had a nice medical plan that she paid $200 a month toward, so the value of what she is sacrificing by going corp-to-corp is $800 a month. Translate to an hourly for a year's worth of coverage, and that's $4.62/hour. On a W2, medical coverage is pre-tax, but Sue will have to pay tax on an extra $4.62/hour of income, so it should be bumped up to $6.28/hour. Sue is now asking her client for a rate of $59.73/hour.
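
Here is a rough Python sketch of that benefit-to-hourly conversion. The 2080-hour year (52 weeks of 40 hours) is my assumption, and so are the function names. I don't state the exact tax rate behind the bump from $4.62 to $6.28; those two figures imply something around 26%, so `gross_up` takes the rate as a parameter rather than pretending to know it.

```python
HOURS_PER_YEAR = 2080  # assumed: 52 weeks x 40 hours

def hourly_equivalent(monthly_cost, hours_per_year=HOURS_PER_YEAR):
    """Spread a monthly cost over a year's worth of billable hours."""
    return monthly_cost * 12 / hours_per_year

def gross_up(after_tax_amount, tax_rate):
    """Pre-tax amount needed so that after_tax_amount is left once tax is paid."""
    return after_tax_amount / (1 - tax_rate)

# Sue gives up $800/month of employer-paid medical coverage.
medical_hourly = hourly_equivalent(800)
print(round(medical_hourly, 2))

# Hypothetical 25% rate, only to show the mechanics of the bump.
print(round(gross_up(medical_hourly, 0.25), 2))
```

The same two functions work for any other benefit Sue wants to price into her rate.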

Now Sue is still out customary W2 benefits like Retirement Plan, Life Insurance, Disability Insurance, paid vacation, and who knows whatever perks her client offers their full-timers, like Health Club membership. And she's got her business expenses, like indemnification insurance, keeping her laptop and home network in good condition, printer toner cartridges, and so on. Sue might feel justified in adding $3-5 more to her rate. But I would advise Sue against sharing these calculations with the client up front. Clients do not feel they are responsible for every aspect of your financial safety net and company operations; what they would say is, "hey, you're a contractor, those are your issues." Instead, Sue keeps the information in reserve. If they balk at her rate of, say, $62/hour, she can just refer to all those other things in passing: "I haven't even mentioned all the other expenses I am shouldering like business costs and insurance, so I think my rate is fair."

Filing and Paying Taxes


Okay now. Decoupling the money your company makes from the salary you pay yourself is a real trip, right? It gets better still when it comes to filing and paying taxes.

Looking at the paystub from your last W2 employer, you might get the feeling that paying taxes is all sort of automatic and done for each paycheck. Not at all. "Payroll taxes" include Federal & state withholdings and both the employee and employer Medicare & Social Security contributions. The State portion covers only the state withholdings; everything else is considered the Federal portion. You file these figures once a quarter, but pay them once a month. If you fall behind on either (I think the grace period is about 15 days), you'll be penalized and have to file separate forms to cover the penalty...nothing crippling, but unpleasant still. The fact that you pay earlier than you tell them how much you owe them, and then can be penalized for getting it wrong, underscores how important it is to keep extremely clear payroll records.

At the time of this writing, Federal filing can only be done with a paper & snail-mail process. Go to the IRS website and download Federal Form 941. You'll notice that it does not distinguish between the employee and employer contributions for SS & medicare; you lump them together. You'll see that you can also send your payment along with this form, but you have the option to handle the payment online, which I'll cover in a moment.

State tax withholdings can be filed in Illinois via the Illinois TaxNet system. (Hopefully your state has a similar system.)

Payment of payroll taxes is once per month. When your W2 employer takes deductions for tax, Medicare, SS, etc., they're actually banking that money until payroll tax time at the end of the month. Both state and federal can be paid online. The federal portions (tax, Medicare and SS) can be paid with the EFTPS system (see links below) and Illinois state tax can be paid with Illinois TaxNet.

Now for unemployment insurance. In Illinois, you both file and pay the state unemployment tax quarterly, using Illinois TaxNet. For Federal, you also file and pay together on EFTPS, and only once, at the end of the year. And remember, it's a very small amount (0.8% of gross earnings).

Summary of Times and Activities


Here's a quick summary of everything that happens over time at various points:

  1. Invoices are submitted to your client, frequency of your choosing.

  2. Client auto-deposits or mails you a check written to your company bank account.

  3. You generate a paystub for yourself every two weeks indicating the gross earnings, the tax withholdings, and the net amount. You write a company check to yourself (your personal name) for the net amount and deposit it into your personal account.

  4. At the end of each month (with 15 days grace) you pay state and federal payroll taxes for the amounts you withheld in salary to yourself during that month.

  5. At the end of each quarter (March 31, June 30, September 30, and December 31, with 15 days grace for each), you file the state and federal payroll taxes deducted during that period. You also file, and pay, state unemployment tax at each of these quarter ends.

  6. At the end of the year, you file and pay federal unemployment tax.
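
If you want those dates worked out for you, the calendar above can be sketched in Python as follows. The 15-day grace period is just my recollection, as noted earlier, so treat `GRACE_DAYS` as an assumption and confirm the real due dates with the IRS and your state; the function names are invented for this example.

```python
import calendar
from datetime import date, timedelta

GRACE_DAYS = 15  # assumption taken from this post; verify the real deadlines

def month_end(d):
    """Last calendar day of the month containing d."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, last_day)

def quarter_end(d):
    """Last calendar day of the quarter containing d."""
    last_month_of_quarter = ((d.month - 1) // 3 + 1) * 3
    return month_end(date(d.year, last_month_of_quarter, 1))

def deadlines(pay_date):
    """Latest dates to pay and to file the payroll taxes withheld on pay_date."""
    grace = timedelta(days=GRACE_DAYS)
    return {
        "pay_payroll_taxes_by": month_end(pay_date) + grace,
        "file_payroll_taxes_by": quarter_end(pay_date) + grace,
    }

print(deadlines(date(2006, 2, 10)))
```

Run it against each pay date in your payroll records to get a checklist of what is due when.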


Why Incorporate, More Reasons



  • With an S-corp you've got all the apparatus for growing into a larger company. Your payroll mechanism scales to multiple employees with no necessary changes.

  • You come to understand how companies work, and your role vis-a-vis clients and the middlemen that separate you.

  • If you go back to W2 employment at some future time, when you get your paystubs, you can go over the figures to see if their withholdings are correct, because you know how they're supposed to be calculated.

  • Using your company as a formality for keeping track of job-related expenses. Need a technical book, or a new wireless router, or some paper clips? Charge it to the company card, then write it off. Incorporation gives you some added authenticity in the eyes of the IRS.



Spreadsheet Tips


My major advice here is to develop a passion for keeping very exact records, and get handy with MS-Excel if you aren't already. Whatever flavor of spreadsheet you use, here are some ideas for you to get started.

  • In one worksheet, I maintain one row for each pay period, with formulae for converting each gross pay to each of the withholdings in separate columns. I copy this row to another worksheet that uses those fields to generate a paystub.

  • Divide the pay period rows into four separate groups corresponding to the quarter of the year. Your state unemployment insurance payments are computed according to quarter.

  • Use another part of your worksheet to keep track of each of the categories of tax payments that you make. Label them according to the month and quarter they were for. The steady bi-weekly, monthly and quarterly paperwork, filings and payments become a blur after a while, so keep good records. The 15th day of the month into a new quarter can be a panicky day if you're unsure of your filings and payments.

  • Use another part of your worksheet to show what taxes are currently owed by comparing the payroll rows with the Tax Payments Made records. And finally you can calculate the financial state of your company: money in the bank after you've made payroll, minus the taxes not yet paid.
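
The "taxes currently owed" comparison in that last tip boils down to a subtraction per tax category. Here is a small Python sketch of the same idea for anyone who prefers code to spreadsheet formulas; the data layout, function names and dollar amounts are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

def taxes_owed(payroll_rows, payments_made):
    """Withheld totals minus payments made, per tax category."""
    owed = defaultdict(Decimal)
    for row in payroll_rows:                  # one dict per pay period
        for category, amount in row.items():
            owed[category] += amount
    for category, amount in payments_made:    # (category, amount) pairs
        owed[category] -= amount
    return dict(owed)

def company_position(bank_balance, owed):
    """Money in the bank after payroll, minus the taxes not yet paid."""
    return bank_balance - sum(owed.values())

# Hypothetical figures: two pay periods, one federal payment made so far.
rows = [
    {"federal": Decimal("413.45"), "state": Decimal("90.00")},
    {"federal": Decimal("413.45"), "state": Decimal("90.00")},
]
payments = [("federal", Decimal("413.45"))]

owed = taxes_owed(rows, payments)
print(owed)
print(company_position(Decimal("5000.00"), owed))
```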


Other Tips



  • For each transaction relating to salary, keep together copies of the pay stub, check, cancelled check and bank statement line item. Staple them all together and file them away. Do the same for each transaction relating to invoicing and client payment. You'll have so much order in your records, you'll be wanting to dare the IRS to audit you.

  • Keep track of your expenditures using your company card or checks for things other than salary. Now here is a case where I recommend using something like Quicken instead of rolling your own Excel spreadsheet. Don't go overboard and invest in QuickBooks or anything, although Quicken "Premier Home and Business" is worth the extra few bucks.

  • It's inevitable that your personal and business bank accounts will intermingle on occasion. Here are two things that happen to me a few times a year: I buy groceries with the company card because I forgot the personal card at home. Or, I absent-mindedly use my personal card to buy a piece of networking hardware that I want to be a business expense. I de-mingle these as they occur, and keep a paper trail. I have separate forms that I use for payment back to the company, or payment from the company, and I copy the check, cancelled check, and bank statement line item for each one, as I do with other expenses.





Links



Helpful keywords


Words and phrases for search engines that can help you find documents that you need are:
payroll tax, bi-weekly payroll period, FICA withholdings, Social Security tax rate, Medicare tax rate,