Friday, August 4, 2017

Remembering "Fool's Gold"

In the mid 80's I moved to Ithaca, NY, home of Cornell University and Ithaca College.  I didn't move for school; I just thought Ithaca would be a cool place to live.  I played music, worked as a research assistant, put hours in at the Food Co-op, made friends, took walks in the rural surroundings, danced, ate vegetarian, and had lots of fun.

I was at a New England-style contradance when I saw Fool's Gold providing the music.  They were interleaving Klezmer with celtic fiddle tunes and other traditional repertoire, and the audience was going nuts!  I walked right up and asked if they needed a pianist.  One rehearsal, and I was in!

So we were:  Eric Pallant, clarinet; Paul Viscuso, accordion; Betsy Gamble and Willow Soltow Crane, fiddles; Ted Crane on percussion and calling; and me on piano.

A lot of the Klezmer energy came from Eric.  First, he knew the Klezmer tradition:  traditional songs by Sholom Secunda, The Barry Sisters, and energetic new instrumental tunes by Mike Bell.  Second, he could make his clarinet wail like a banshee and quack like a duck.

Paul had a knack for composing and finding soulful, beautiful tunes with multi-part melodies that made for some wonderful romantic waltzes.  He was also a titan of the keyboard accordion, and could make the dancers really move with his rhythm.  He was the band's leader, too, and he did it with just two words:  NEXT!! (for changing tunes) and OUT!! (for ending the dance set).

Betsy Gamble had incredible traditional fiddle chops, and Willow was no slouch either.  Together they worked out many double-fiddle arrangements that brought a lot of grit to the songs with the rosin on their bows.

Ted called the dances, and added some rhythmic drive with his percussion.  He played the bones, and, to make our Contradance-Klezmer fusion all the more absurd, played the Irish Bodhrán.  

Fool's Gold had a great tonal palette.  Fiddle, clarinet, and accordion make for a magical, sensual timbral quality when you play them on a melody, or in harmonies.

Something I love about Klezmer is that you almost can't exaggerate the craziness.  One thing I used to do on the piano is an upward glissando from the lowest bass notes right before nailing the root on the downbeat.  It makes a funny, growling, sweeping sound.  In a lot of music I'll save such an effect for one key moment, because a little bit can go a long way.  Not in Klezmer!  You can do it over and over, and the music just gets funner and funnier.

Once we played at the annual Ithaca Festival.  During one of our Klezmer numbers, the crowd spontaneously formed an Israeli-style circle dance.  I love New York.

Life in Ithaca NY doesn't stay still much, because many people are on the move.  People are in town for a few years for school, and then move on.  In 1987, some of us were moving on to teaching jobs, others moving away for grad school elsewhere.  So we gathered at Andy Ruina's house on Teeter Road and recorded using his wonderful baby grand piano with great growling bass notes.  We did it in a short time, with no overdubs, and very few takes.  If there was a blemish here or there, or one or two songs were maybe not quite ready for prime time, we laid them down anyway and put them out there.  It reflected the spontaneous nature of Ithaca, our youth, and the joy of living in the moment.

I have just put all the tracks of our cassette (and later CD) "Contras From the Old Country" up on YouTube.  The playlist of the full album is here.  Enjoy!

Thursday, June 29, 2017

New Roxy & Elsewhere video!

Roxy And Elsewhere (1974) is held by many Frank Zappa fans to be his single best album, so it was a  seismic event when the video of the live performance became available in the form of the Blu-Ray release Roxy the Movie.

Now new video from the same performance, but absent from the Blu-Ray release, has come out on YouTube.  I don't know why they were excluded, but it could be due to the more "adult content" nature of two of the three songs, which in this 21 minute clip, are:

  1. Pygmy Twilight, featuring the erotic antics of Pamela Miller, a long term Zappa family friend and actress in Zappa projects.  If you thought the "Brenda the Harlot" segment in Roxy the Movie (during the Be-Bop Tango of the Old Jazzmen's Church) was spicy, that's nothing compared to this.  As a release, it leaves something to be desired because Ms. Miller's caresses interfered a fair amount with Napoleon Murphy Brock's singing.
  2. Oh No, a standalone version without being followed by The Orange County Lumber Truck as it often was.  
  3. Dickie's Such an Asshole.  Performed as an encore, with minor 'audience participation' at the end.  It's a shame this Zappa song is lesser-known, as it's a good one.  

Monday, June 26, 2017

"Don't Die" Twin Peaks s03e06 Spoiler Summary

What could be more Lynch-ian than doubling down on frustrating your audience?

Kyle MacLachlan's acting talents are still being spent on the nearly-a-zombie "Dougie" character.  The needle has moved ever so slightly in that we learn he's an idiot savant at accounting fraud.  Through a series of lines, doodles and drawings of ladders, he was able to show his boss something actionable in some case files.

I suspect that Lynch is going to finesse this all in the end, and here is my theory:  Agent Cooper is actually sentient inside Dougie.   He's stuck inside there just like he was stuck in the red room, the intermediary limbo dimension, and the glass cube.  Like prisoners of war who find ways to communicate through taps and scratches, Coop has found a way to manifest crude actions through Dougie.  When Coop finally comes out we'll learn he's been solving crimes and tying together the loose pieces of this season's numerous plots.

And Lynch is still introducing them.  We have a semi-midget hit man who kills with an ice-pick, a psycho organized crime boss, and a disgruntled underling of same who runs over a child in his truck, and the return of the trailer park manager from "Twin Peaks: Fire Walk With Me."

And so I'll soldier on, hoping for better in episode 7.

Monday, June 12, 2017

Twin Peaks s03e05 Spoiler Recap

The title of this episode is "Vegas, Baby."

Not the best episode so far.  The dangerous-Agent-Cooper-in-jail story line has moved along an inch or two, when Cooper uses his one phone call to somehow (with touch tones alone) hack into the jail's alarm system.  We see the incapacitated-Agent-Cooper-as-Dougie at work (he's in Life Insurance), but it all seems for comic effect; story line barely moved at all.

Most of the rest of the episode is devoted to short vignettes introducing new characters with new threads, which, as we know from Lynch's Twin Peaks work in general (especially season 2), might be relevant or might be false leads.  For some reason Lynch likes producing that kind of fatigue in his audience.

Specifically:  we meet a young couple who is sponging off of Shelly, who still works at the Double R Diner ("NOW, Shelly"...that Shelly), a misogynist/drug dealer who hangs out at the Bang! Bang! bar, and a military career gal who's going to get the "Area 52" branch of the military involved in the ongoing murder investigation in South Dakota.

There is one bright spot in the Dougie story.  Although he has a long way to go to recovery, the incapacitated-Agent-Cooper-as-Dougie shows momentary sparks of recognition.  Every now and then someone says a word or phrase that strikes a chord in Dougie's FBI self, such as "case file":  he stops, repeats the word, and seems to think about it.  The best one is the effect the word "coffee" has on him; that really gets a rise!  He actually gets his hands on a cup of Starbucks and suckles it like it's a baby bottle containing life itself.

The brightest spot in the episode is our learning of what Dr. Lawrence Jacoby has been doing with himself.  We've seen him receiving a shipment of standard hardware store shovels via UPS.  Later, he painstakingly spraypainted them gold.  And now we see that he hosts, under the name "Dr. Amp",  a periodic video/podcast show about conspiracy theories, the evils of government, and the poisoning of our environment by multinational corporations.  He's got a great audience:  Nadine, the eye-patched wife of Big Ed Hurley watches.  And Jerry Horne, the ne'er-do-well younger brother of Northern Lodge owner Benjamin Horne, tokes up while listening. 

And how does Dr. Amp fund his show?  He offers his audience a solution: they have to "dig themselves out of the shit."  He even shows himself shoveling himself out of a waist-deep pit of brown muck, with what else but his $29.99 Gold Shit-Digging Shovel, available by mail, Order Now!  By God, Lynch still has the comedy touch.

Sunday, June 11, 2017

Twin Peaks s03e04 Spoiler Recap

There's a Lynch humor piece in this episode.  You don't laugh at it.  It's only funny in the way some SNL skits are badly acted but hilarious in their premise.

Back in Twin Peaks, the sheriff's station still has klutzy, whiny-voiced Lucy working the front desk, and hapless (and useless as ever) deputy Andy Brennan, sporting an absurdly high cowlick, along with a middle-aged paunch.  Not being terribly good actors, and lacking their youthful charm of the 90's, their return hasn't been particularly delightful.  The new element, though, is that they have a son named Wally Brando.  In fact, in this episode they are just receiving the news that Wally Brando (always mentioned by full name) has just arrived in town.  You see, Wally Brando is a soul of the road.  He rides his motorcycle hither and yon following his heart on an ongoing discovery of the American spirit.  Lucy and Andy are proud of him for it the same way that parents would be of an Olympic medal or Nobel prize winner.

Joke 1 is that Wally Brando is decked out in a perfect replica of Marlon Brando's biker from The Wild One (1953), and joke 2 is that he's played by Michael Cera.  Joke 3, the cruelest of all, is that Lynch graces us with a five minute (feels like twenty) soliloquy by Wally Brando, motionless, with Andy and Lucy looking on admiringly from his side, a seriously-delivered but cliché- and pablum-laden discourse on the truth and goodness of the great American highway.

Like I say, no one's laughing, yet it's funny as hell.  Michael Cera doesn't even ride the motorcycle, just sits on it.

The rest of the episode is similarly light in flavor, compared to the macabre dimension-travelling of the previous.  Agent Cooper's earthly persona is still mentally incapacitated, and having replaced the Cooper lookalike "Dougie", he is still assumed to be Dougie by those around him.  He pretty much just bumbles around like Peter Sellers's Chance the gardener from Being There.

The dangerous hood Cooper character is now in jail, having been found with firearms and drugs in his wrecked car.  This turns out to be the Cooper that the FBI locates, not Dougie.  They have an interview with him, in which the jailed Cooper does a very leaden impression of the real Agent Dale Cooper, with obviously rehearsed lines, which fools the FBI not one bit.  In the meantime, we're given a major reveal about who this Cooper is (although it's been pretty strongly hinted).  It's Bob.  

He is Bob!  Eager for fun.  He wears a smile.  Everybody run.   
- One-Armed Man, Season 1   

Somewhere back in Twin Peaks history there was a scene where Cooper stared in a mirror and suddenly thrust his forehead right into the glass.  In the cracked, bloodied mirror, his reflection is Bob.  We also see the two of them together in the red room, laughing maniacally together like satanic frat brothers.  So the conclusion is that Bob took over Cooper's body, trapped Cooper's real identity in the Red Room, and has been running amok for the last 25 years.  Something doesn't line up, though; Bob isn't inhabiting a being for evil (such as killing one's own Prom Queen daughter), he's more of a hit man for organized crime.

Andy Brennan, Wally Brando and Lucy Brennan

Twin Peaks s03e03 Spoiler Recap

In episode 3, there are actually three Kyle MacLachlan characters.  Number one is Dale Cooper, FBI agent wearing his familiar black suit, who is trapped in the other-worldly red room.  Number two is the hard boiled criminal character, also referred to as Cooper, and involved in FBI work.  Number three is a kind of duncey character named Dougie who lives in Las Vegas and sees prostitutes, but we've only seen him just lately, and briefly.

The entire episode (1 hour) is dedicated to showing the movement of characters in and out of their dimensions, and how they swap bodies.  We know from other films that Lynch loves to move the camera into other worlds.  The camera will follow the sound coming from a telephone earpiece, and go right through the holes into the electronics.  Or the camera will travel through walls to show the spaces in between rooms, with dust, drywall matter, and rodents.

So here Lynch seems to have put quite a lot of thought into the experience of being in a strange dimension.  The black-suited Cooper leaves the red room and enters into a creepy limbo.  There is a woman with missing eyes who can't speak, in a room with a strange steampunk apparatus in a wall panel, and a metal door somebody is heavily pounding on from the outside. They climb up a ladder, open a hatch and climb out onto a platform suspended in outer space, very much like Le Petit Prince standing on one of those little planets, but more terrifying.  The woman falls off the platform and presumably falls for eternity.  Eventually Cooper climbs back down, and a woman in red by a fireplace advises that he must go NOW.  Cooper ends up passing through the steampunk apparatus, painfully...and leaving behind his shoes.

The other Cooper, and Dougie, back on earth, experience extreme vomiting.  Dougie shrinks to nothing and black-suited Cooper comes out of a wall socket in the form of a black gas, eventually appearing in solid form lying on the floor.  His faculties haven't recovered, and he is a Rain Man-style imbecile.  He ends up in a casino, and starting with a $5 bill, wins enormous payouts from every slot machine he tries.  He eventually gets picked up by the authorities, and he's reported to FBI headquarters where Gordon Cole, the hard-of-hearing senior FBI guy played by David Lynch himself, celebrates the news of Cooper's final return.

The tough-guy Cooper ends up in a car wreck, and we don't see what's become of him physically.

Twin Peaks returns! Season 3, episodes 1-2 spoiler recap

I just finished episode 1 of the new TP, which is two hours of David Lynch amazingness. It's real "Holy Fuck" stuff. However it is not a return to the goofy fun of the original TP. In fact, of the many story lines, few have any evident connection to the Laura Palmer story. The short bits that have original cast members are sometimes weak and perfunctory, as though DL is using a TP season three as a thinly veiled opportunity for doing new work that interests him more. But in other ways not so much...too soon to judge. In sum, this is very possibly one of DL's great works and deserves to be taken seriously. But it's also true that if you're not up for a lot of disturbing horror material (masterfully done), TP season 3 may not be for you.

I'll say this about the "Dale Cooper" in this episode. He's not just a linear extrapolation of the quirky, one dimensional FBI man of years ago. He's more the product of someone who has had dramatic life changes and choices over the course of 25 years, i.e. like real life. And in keeping with Kyle MacLachlan's acting weight, he is carrying a lot of the film.

Log Lady appears to have been filmed "in time" before the actor's demise, but her illness is evident in her scenes.

Some actresses with girlish charm in their twenties keep it on into their fifties, but Sheryl Lee (Laura Palmer) doesn't seem to be one of them. She and Dale Cooper are in dream sequences in the red room once again, but they don't work as well.

So the thing about Dale Cooper (or whoever Kyle M. is supposed to be...not clear) is he's now a real "heavy", a very dangerous guy, who has various shady thugs working for him and "gun moll" style girlfriends, all of whom are under constant threat and intimidation by him. Picture below.  He wears a leather jacket and long hair, speaks in a low-pitched gruff voice, appears only in seedy hotels and restaurants and rustic lodges, and has a bit of a southern air to him...someone joked that it's "Twin Peaks meets Duck Dynasty."   So he seems like a criminal, but at the end of episode one he gets on a computer and logs into an FBI portal, so he must actually be in some kind of deep cover.

Log Lady is sad to watch, as you can see from the pic below, she really was very ill when they shot her scenes.  She's holding the log in that pic.  She is instructing "Hawk" (Sheriff Truman's spiritual American Indian deputy) with clues (i.e. what her log is telling her) about where to find things, over the phone.

Anyone of our age group has gotta detest the actor who plays James Hurley's ability to defy age!  23 yrs old then, 50 now, he looks like a hipster single guy in his 30's.

One more thing, there are lots of clips from the original season 1 series, and the very opening clip is original footage of Laura Palmer in the red room saying "I'll see you again in 25 years."  Yes, that really happened!

They also re-enact, in their current characters, a famous bit of Red Room dialog from season 1:
Cooper:  "Are you Laura Palmer?"
Laura:  "I feel like I know her, but... (in agony) sometimes my arms bend back."

Dale Cooper, the heavy

Log Lady, passing on clues from her Log to Deputy Hawk

Log Lady as we knew and loved

James Hurley, defying age

 Deputy Hawk

 Laura Palmer

Thursday, February 23, 2017

The Washing Machine

I recently started collecting Boogie Woogie tracks off of YouTube and learned some interesting stuff along the way.

I started out thinking about BW as being some early form of rock and got excited about finding songs from 1930's and earlier that had BW traits, thinking they were extraordinary gems.  Then I found out (see Wikipedia for this) that BW goes back to even before the twenties, and indeed, is as old as jazz itself.

Next I learned that BW became kind of a cultural virus around 1947, and virtually everyone was recording at least one BW song.  Almost every song from that period has 'Boogie' in the title, usually "XXX's Boogie".  All the hitmakers of the day have BW songs, Woody Herman, Count Basie, Tommy Dorsey, etc.  Check out YouTube and you'll find scores of BW playlists.

But it gets more interesting than that.  BW began in east Texas logging camps and developed organically with the expansion of the railroad.  Ground zero was (is) Marshall, TX, where they celebrate that heritage to this day.  BW style in Marshall remained unchanged, but at each train stop (logging camp) the style became more evolved.  You could estimate the distance from Marshall by how evolved the BW style sounded in the bar you were in.  BW pianists spent their years traveling from camp to camp on the RR.  And to state the obvious, sonically, BW reflected the sound of the engines and clatter of trains.

Pinning down the beginning of rock is nigh impossible.  For some odd reason there are people who want to say that Ike Turner's 'Rocket 88' is the first rock song, apparently due more to its popularity than its innovation.  Infectious dance beat and riotous swing?  Hell, the very earliest BW of the 20's has that.  Sexual suggestivity?  BW lyrics sometimes exhort the women to shake their asses.  Did BW walking basslines get replaced with something else?  Nope, early rock by Elvis still has standard BW basslines.  Electric guitars?  Now you're talking.  I would say that the earliest BW having an electric guitar is a good candidate for pioneer rock. That brings Western Swing (Bob Wills, Spade Cooley) into the fold.

The most insightful statement I've ever heard about the origins of Rock was by Robbie Robertson of The Band, in the film The Last Waltz.  It's Louisiana gumbo and blues...all of it.  Its genesis was in religious revival meetings and county fairs, where late into the night after the community leaders had gone home, you'd have musicians throwing these styles all together in unorthodox ways, and doing crazy stage antics (e.g. Chuck Berry's duck walk).

Something I love in the earliest rock is a particular sonic quality in the rhythm section that I call the Washing Machine.  The recipe seems to be double bass (acoustic), piano and drums with brush sticks.  When they're playing really tight, the three fuse into a single rhythm-machine timbre.  A wacky, anthropomorphized Washing Machine drawn by R. Crumb comes to mind.  You hear it on a lot of early Elvis tracks.

Friday, May 15, 2009

Solving Character Set Problems: ASCII, ISO-8859, WinLatin1 and UTF-8

Character Sets and Feeds

I work for a company which receives huge incoming text feeds from our clients, which we filter, transform and massage to go out to the major search engines. Much of the work of cleaning incoming feeds consists of replacing undesirable characters (e.g. fancy foreign characters and symbols) that come from the client. For search engines we have to provide the lowest common denominator of character type, plain ASCII. There are many ways these "bad" characters creep into incoming feeds. Some are mainstream ISO 8859 or UTF-8 characters that are accepted by most display systems, but are still undesirable for search engines. Less likely but still possible are errors caused by the use of a legacy non-printing control character. By far the most common offender comes from violation of ISO 8859 conventions that have come into practice, known as "Extended Ascii" (of which Microsoft's WinLatin1 table is the most notorious offender). This page explains the major character set types, along with a little history, and the problems that they cause to feeds.

7-bit Ascii (ISO 646)

The original ASCII table, a 128-value (7-bit) character set is formally known as ISO 646. In fact it remains the only real "Ascii table" (more on this later). This portion has survived unchanged as the beginning portion of all major character sets (even the ones tampered with by Microsoft).

Below is the printable portion of ISO 646, characters 32 through 127 (the end of the table).

Below is the non-visible characters section of ISO 646, the first 32 characters of the table. Note that only a handful are in modern use; the rest are historic relics.

In feeds from our clients, mistaken input of legacy control characters is rare, although recently one client reported seeing 0x1D ("Group separator GS Left Arrow") in a feed.

8-bit Character Tables and ISO 8859

Ascii remains to this day, strictly speaking, a 7-bit table of 128 characters and no more. However, since in modern times characters are rendered with 8-bit bytes, people have wanted to take advantage of what is available on their machine in the area of that top bit. Folklore, common practice, and whatever happens to be on the machine one is using have misled many into thinking some universal 8-bit ascii table exists. But in reality the 8th bit portion has been a playground for whatever people wanted it to be.

To answer this havoc, ISO 8859 defined a variety of 8-bit character tables to cover most of the world's needs. For every ISO 8859 table, the 7-bit portion is ISO 646 Ascii. The best-known of the ISO 8859 tables, at least among the English speaking and Western European world, is ISO 8859-1, also known as "Latin 1." To some people, this is "the" Ascii table. The eight bit portion is shown below.

Note that there is an unused portion (greyed out) of the table between characters 128 and 159. Take special note of this; as you'll see below this special area becomes a huge area of controversy (Extended Ascii).

Other ascii tables in the ISO 8859 collection include Baltic, Cyrillic, Arabic, Greek, Hebrew and Turkish character sets, allowing each country to adopt the appropriate table, and software manufacturers to write compliant products. (You can see the various ISO 8859 tables here.)

At my company, a feed using ISO 8859 8-bit characters will need to be cleaned before going out to search engines. If they are ISO 8859-1 (Latin1) characters, the mapping to a substitute character is well-known and easily fixed. We do this with Perl scripts and a simple mapping table. Another possible problem would be if we had a client that is using some ISO-8859 table other than Latin1 (for instance, a client from Scandinavia might use the Baltic table, ISO 8859-4). In that case we would have to create a new mapping table. Worse yet, files cannot identify themselves as being any particular ISO-8859 subversion, so if a client doesn't announce the table they are using, we would only find that out by seeing unusual errors. For example, seeing frequent use of a copyright symbol © in unexpected places would be a clue that some non Latin1 ISO-8859 table was in use.
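
The mapping-table approach described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl scripts we actually use; the table entries, the `LATIN1_TO_ASCII` name and the `clean_latin1` function are hypothetical and chosen only for the example.

```python
# Hypothetical substitution table: Latin-1 byte value -> plain ASCII replacement.
# The entries are illustrative only, not a production mapping.
LATIN1_TO_ASCII = {
    0xA9: "(c)",  # copyright sign
    0xAE: "(R)",  # registered sign
    0xC6: "AE",   # capital ligature AE
    0xD8: "O",    # capital O with stroke
    0xE6: "ae",   # small ligature ae
    0xE9: "e",    # small e with acute accent
}


def clean_latin1(data: bytes, unknown: str = "") -> str:
    """Return an ASCII-only string for a buffer of ISO 8859-1 bytes."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))  # 7-bit ASCII passes through untouched
        else:
            # 8-bit byte: use the substitute if known, else the fallback
            out.append(LATIN1_TO_ASCII.get(b, unknown))
    return "".join(out)
```

Bytes with no known substitute are simply dropped by default; passing `unknown="?"` would make them visible instead. A different ISO 8859 table would need its own dictionary.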

It's not clear why ISO 8859 left the area between characters 128 and 159 unused. In one table I saw, each of the codes was described as being a control character of some sort.

'Extended Ascii' and Windows 1252

Microsoft couldn't resist using that empty area of the ISO-8859 tables, so they went and created their own 'extended version' of the ISO 8859 table set; you can see them all here. Their version of ISO 8859's Latin-1 table goes under the name WinLatin1 or Windows 1252. The 8th-bit portion of this table is shown below, with the characters Microsoft inserted into ISO 8859's empty area marked in yellow.

Due to the popularity of the Microsoft Office family of software products, WinLatin1 has achieved an unfortunate hegemony in the computer world. In the competition for the elusive "8-bit Ascii" table, WinLatin1 is ISO 8859-1's chief competitor. However, a sizable portion of the computer world (Macs and Unix flavored systems) have settled on the real ISO 8859 standard, so problems occur when people paste, email or otherwise transmit text made from a Microsoft product to be displayed on a Unix or Mac box. When you see question marks or 'ascii garbage' in a file when you're not expecting it, you have probably been WinLatin1-rolled.

The term "Extended Ascii" has become the name to describe the format of any file which contains a mixture of ISO 8859 plus characters in the forbidden zone of bytes 128-159.

So what did Microsoft do with that precious block of 32 characters it stole from ISO 8859? Four characters comprise the notorious MS-Word 'smart quotes' (single and double: ‘, ’, “ and ”). On the other hand they were remarkably prescient in including the Euro currency symbol (€) years before the currency was adopted. There are a number of punctuation and editorial marks or symbols, such as daggers († and ‡), the ellipsis (…), the list bullet (•) and several others that perhaps are used in European languages (ˆ, ‹, ›, and „). Many choices are totally inscrutable, however. Why define the comma-lookalike "‚"? Of what use can the few foreign alphabetical characters of Œ, œ (used in old German and old Romance languages), Ÿ (used in Greek transcription and rarely in French), Š, š, Ž and ž (Estonian, Finnish and Czech characters) be?

Use of the "forbidden zone" of ISO 8859 accounts for most of the cleanup tasks in clean scripts at my company. The trademark sign (™) is a favorite of many of our clients, since it is unavailable in ISO 8859-1. The list bullet and daggers are also occasionally found in feeds. Most likely the source of these characters is from client use of Windows Office software.
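
As a rough sketch of this kind of cleanup, the Python below locates bytes in the 0x80-0x9F range and swaps in ASCII substitutes. It is an illustration under assumptions, not our actual clean script: the `WIN1252_TO_ASCII` table is a small hypothetical subset, and the function names are made up for the example.

```python
# Illustrative subset of the Windows-1252 characters that live in 0x80-0x9F.
WIN1252_TO_ASCII = {
    0x85: "...",   # ellipsis
    0x91: "'",     # left single smart quote
    0x92: "'",     # right single smart quote
    0x93: '"',     # left double smart quote
    0x94: '"',     # right double smart quote
    0x95: "*",     # list bullet
    0x99: "(TM)",  # trademark sign
}


def forbidden_zone_offsets(data: bytes) -> list:
    """Return the offsets of all bytes in the ISO 8859 forbidden zone."""
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


def clean_win1252(data: bytes) -> bytes:
    """Replace forbidden-zone bytes with ASCII; leave all other bytes alone."""
    out = bytearray()
    for b in data:
        if 0x80 <= b <= 0x9F:
            # Unknown forbidden-zone bytes are dropped
            out += WIN1252_TO_ASCII.get(b, "").encode("ascii")
        else:
            out.append(b)
    return bytes(out)
```

Note that this only makes sense on files already known not to be UTF-8, since UTF-8 multibyte sequences routinely contain bytes in the same range.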

UCS/Unicode - ISO 10646

The next evolutionary step after ISO 8859 is UCS (Universal Character Set, ISO 10646). It dispenses with the approach of multiple table sets for individual languages in favor of a single giant table of all possible characters. Its table size of 2^31 (2,147,483,648) characters encompasses the character sets of virtually all the languages of the world. It goes by the more common name of Unicode, after the name of the consortium that merged with UCS.

The first 256 characters of Unicode are completely backward compatible to ISO-8859-1. Unicode's first 128 characters are the classic ISO 646 7-bit Ascii set, while Unicode characters 128-255 are the top half of ISO-8859-1 Latin1.

File Formats vs. Byte Sequences

With a character set of Unicode's size, single byte sequences with a maximum numeric range of 0-255 can obviously no longer be the only way in which a file stores text. UCS has spawned two different byte-sequence conventions, UCS-2 and UCS-4. UCS-2 files have two-byte characters, and UCS-4 files have 4-byte characters.


UTF-8 is a byte-based sequence convention for representing Unicode. It can have 1-, 2-, 3-, 4-, 5- or 6-byte long characters, changing when and where it needs to (RFC 3629 has since restricted UTF-8 to sequences of at most four bytes, but the original design allowed up to six). It is an efficient solution for English and other Western European languages that spend much of their time in the first 128, one-byte wide Unicode characters. In fact, English UTF-8 files will often consist of nothing but one-byte sequences, no different than an Ascii file. Because of this UTF-8 is nicely backwards compatible to ASCII, meaning a file can be UTF-8 but still work on older systems that support ISO-8859 or earlier standards. Even when a UTF-8 file does use larger width Unicode characters, it is still a "one octet encoding unit" (single byte) encoding standard, as opposed to UCS-2 and UCS-4 which are two- and four-octet encoding unit standards.

UTF-8 should always be spelled as shown. Referring to UTF-8 as utf-8, utf8, UTF8, etc. is considered bad form.

Unlike UCS-2 and UCS-4, which represent the higher order values of Unicode directly in binary with 16-bit and 32-bit data units, UTF-8 remains a "one octet encoding unit" (single byte). How does it represent Unicode values higher than 255? It encodes them with byte sequences of variable length. The values for these encoding bytes all lie within the range 128-255, perhaps not coincidentally the high bit portion of the 8-bit byte. In UTF-8, none of the values in the 128-255 range represent an actual character. This range is broken down to sub-ranges of bytes which are designated as "signifiers" for the beginning of 2-, 3-, 4-, 5- or 6-byte sequences. See the table below. Bytes in the red region, hexadecimal C2 through DF, are used to announce the beginning of a two-byte sequence; bytes in the blue region a three-byte sequence; and so on, to the orange region for 6-byte sequences. The characters in the purple region are the data values that can follow the signifiers. By using just these values, the complete set of Unicode characters that comprise two or more bytes can be constructed.

The greyed out areas are prohibited values in UTF-8 (although there is an exception that will be discussed later).

For single byte characters, UTF-8 simply uses the first 128 Unicode characters, the ones that correspond exactly to classic Ascii 7-bit ISO 646. No special signifier announces a single byte sequence.

An easier way to understand the multi-byte signifier approach is by looking at the bit patterns in their binary values, as shown below. Two-byte sequences can be announced with any byte having 110 as the top three bits; three-byte sequences can be announced with any byte having 1110 as the top four bits; and so on.
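
The bit-pattern test can be written down as a short Python sketch. The function name `utf8_byte_role` and the role labels are invented for this illustration. It classifies purely by the top bits, following the original six-byte design, so it does not know about the prohibited values (for example 0xC0 and 0xC1, which match the two-byte pattern but are never legal).

```python
def utf8_byte_role(b: int) -> str:
    """Classify one byte value (0-255) by its high-order bit pattern."""
    if b & 0x80 == 0x00:
        return "single"        # 0xxxxxxx: plain 7-bit Ascii
    if b & 0xC0 == 0x80:
        return "continuation"  # 10xxxxxx: data byte following a signifier
    if b & 0xE0 == 0xC0:
        return "lead-2"        # 110xxxxx: announces a two-byte sequence
    if b & 0xF0 == 0xE0:
        return "lead-3"        # 1110xxxx: announces a three-byte sequence
    if b & 0xF8 == 0xF0:
        return "lead-4"        # 11110xxx: announces a four-byte sequence
    if b & 0xFC == 0xF8:
        return "lead-5"        # 111110xx: five bytes (original design only)
    if b & 0xFE == 0xFC:
        return "lead-6"        # 1111110x: six bytes (original design only)
    return "invalid"           # 0xFE and 0xFF never appear in UTF-8
```

For the copyright symbol 0xC2 0xA9, the first byte classifies as "lead-2" and the second as "continuation".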

Cleaning a file of any UTF-8 multibyte sequences is complicated by the fact that you have to "look ahead" in the byte stream to determine whether a byte is the start of a multibyte UTF-8 sequence, or an isolated ISO 8859 character. When a UTF-8 sequence is found, the n-byte sequence must be replaced with a single ASCII character (if the appropriate substitute is known), or simply removed (if no substitute is known). In practice, our clients use a very small repertoire of symbols that are easily replaced (trademark, registered, bullet, etc.).
If one wanted to extract the Unicode sequence numbers that fall in the above 255 range from a UTF-8 file, reverse computation would be required, since the Unicode sequence numbers above 255 are encoded.
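
That reverse computation amounts to stripping the marker bits and gluing the remaining payload bits back together. The sketch below is one way to do it in Python; decode_sequence is a hypothetical helper name, it handles 2-, 3- and 4-byte sequences only, and it skips the finer validity checks (such as overlong forms) that a real decoder performs.

```python
def decode_sequence(seq: bytes) -> int:
    """Recover the Unicode code point from one multi-byte UTF-8 sequence."""
    lead = seq[0]
    if (lead & 0xE0) == 0xC0:
        length, value = 2, lead & 0x1F   # 110xxxxx keeps 5 payload bits
    elif (lead & 0xF0) == 0xE0:
        length, value = 3, lead & 0x0F   # 1110xxxx keeps 4 payload bits
    elif (lead & 0xF8) == 0xF0:
        length, value = 4, lead & 0x07   # 11110xxx keeps 3 payload bits
    else:
        raise ValueError("not a multi-byte signifier: %#x" % lead)
    if len(seq) != length:
        raise ValueError("expected %d bytes, got %d" % (length, len(seq)))
    for b in seq[1:]:
        if (b & 0xC0) != 0x80:
            raise ValueError("not a continuation byte: %#x" % b)
        value = (value << 6) | (b & 0x3F)  # each data byte adds 6 payload bits
    return value


if __name__ == "__main__":
    print(hex(decode_sequence(bytes([0xC2, 0xA9]))))        # copyright symbol
    print(hex(decode_sequence(bytes([0xE6, 0x97, 0xA5]))))  # first Japanese character
```

Python's built-in decoder does the same job, so ord(bytes([0xC2, 0xA9]).decode("utf-8")) is a handy cross-check.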


Working UTF-8 Sequences

Copyright symbol: 0xC2 0xA9

Not-equals sign: 0xE2 0x89 0xA0 (Unicode char U+2260)

Korean text: 0xED 0x95 0x9C 0xEA 0xB5 0xAD 0xEC 0x96 0xB4 (Unicode chars U+D55C U+AD6D U+C5B4)

Japanese text: 0xE6 0x97 0xA5 0xE6 0x9C 0xAC 0xE8 0xAA 0x9E (Unicode chars U+65E5 U+672C U+8A9E)

Working ISO 8859-1 (Latin1) Sequences

0x31 0x32 0x33 0xE6 0xD8 0xC6 (1,2,3, æ, Ø, Æ)

Mangled Sequences

0x31 0x32 0x33 0x99 (1,2,3, winlatin1 trademark)

0x31 0x32 0x33 0xE6 0xD8 0xC6 (1,2,3, three Latin1 chars, one winlatin1 char)

0xE6 0x97 0xA5 0xE6 0x9C 0xAC 0xE8 0xAA 0x9E 0xE6 0xE8 0x31 (UTF-8 Japanese text followed by misused signifiers)
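
To make the "look ahead" cleaning idea concrete, here is a rough Python sketch that reduces byte strings like the ones above to 7-bit ASCII. Everything named here (clean_to_ascii, sequence_length, and both substitution tables) is hypothetical and only for illustration; the substitutes are placeholders, not our production rules.

```python
# Hypothetical substitution tables: Unicode code points and isolated WinLatin1 bytes.
UNICODE_SUBSTITUTES = {0x00A9: "(C)", 0x00AE: "(R)", 0x2122: "(TM)", 0x2022: "*"}
SINGLE_BYTE_SUBSTITUTES = {0xA9: "(C)", 0xAE: "(R)", 0x99: "(TM)", 0x95: "*"}


def sequence_length(data: bytes, i: int) -> int:
    """Look ahead from position i; return the sequence length, or 0 if none starts here."""
    lead = data[i]
    if (lead & 0xE0) == 0xC0:
        n = 2
    elif (lead & 0xF0) == 0xE0:
        n = 3
    elif (lead & 0xF8) == 0xF0:
        n = 4
    else:
        return 0
    tail = data[i + 1:i + n]
    if len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
        return 0
    return n


def clean_to_ascii(data: bytes) -> str:
    """Replace or drop everything that is not 7-bit ASCII."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
            continue
        n = sequence_length(data, i)
        if n:
            try:
                cp = ord(data[i:i + n].decode("utf-8"))
            except UnicodeDecodeError:
                n = 0  # shaped like UTF-8 but rejected by the decoder
        if n:
            out.append(UNICODE_SUBSTITUTES.get(cp, ""))  # unknown symbols are removed
            i += n
        else:
            out.append(SINGLE_BYTE_SUBSTITUTES.get(b, ""))  # isolated 8-bit byte
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(clean_to_ascii(bytes([0x31, 0x32, 0x33, 0x99])))
    print(clean_to_ascii(bytes.fromhex("e697a5e69cace8aa9ee6e831")))
```

The design choice worth noting is that a byte is only treated as a signifier if the right number of data bytes actually follow it; otherwise it falls through to the single-byte table.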

How Do You Detect a File's Encoding?

None of the character tables discussed so far (with a possible exception of UTF-8, discussed below) have any convention for a self-identifying start block or header. Detection has to be done based on some sort of pattern in the bytes. In Unix, the 'file' utility (or 'type' on some versions of *nix) is a pretty good tool for identifying the character table type. Here's the pattern that 'file' seems to use:

When | Unix 'file' reports
All characters are 7 bit | "ASCII text"
ISO 8859 8-bit characters (with or without 7-bit ASCII mixed in) | "ISO-8859 text"
Any use of bytes in the ISO-8859 forbidden range (0x80-0x9F) which aren't part of valid UTF-8 sequences | "Non-ISO extended-ASCII text" (This is how WinLatin1 files will be identified.)
Consistent correct use of multi-byte UTF-8 sequences | "UTF-8 Unicode text"

How 'file' handles some edge cases:

When | Unix 'file' reports
All 7-bit, but containing odd combinations of control characters from the 0x00 to 0x1F range | "Data"
Correct UTF-8 code mixed with any 8-bit ISO 8859 characters, forbidden or non forbidden | "Non-ISO extended-ASCII text"

What 'file' cannot do is read your intentions. It makes the simplest judgment call it can. For example:

  1. A file of all 7-bit characters is always "ASCII text"...never mind that you think of the file as being UTF-8 or ISO 8859 compliant (both of which are also true).
  2. ISO 8859 detection is based on byte values alone, not syntactic patterns; it can't tell whether you want it rendered as Latin1, Baltic, Cyrillic, etc. It reports ISO 8859, not ISO 8859-1 or ISO 8859-2, etc. Telling them apart would take some fairly elaborate dictionary lookups from multiple languages.
  3. Any UTF-8 multi-byte sequence not containing bytes 0x80 to 0x9F is hypothetically correct ISO 8859. Because these are syntactically rare, 'file' chooses to identify this as UTF-8 (assuming everything else about the file is UTF-8 compliant). It cannot read your mind that you intended a series of weird ISO 8859 characters. For example you might want the sequence 0xEA 0xB5 0xAD to read as ê, µ and a soft hyphen in Latin1, but since it is a valid UTF-8 sequence (국, U+AD6D, from the Korean text above), 'file' will report it as "UTF-8 Unicode text" if everything else about the file is consistent with UTF-8.
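
The main rows of the tables above can be approximated in a few lines of Python. This is only a sketch of the observed behavior, not the actual algorithm inside 'file'; guess_encoding is a made-up name, and the "Data" case for odd control characters is left out, as is the edge case of valid UTF-8 mixed with non-forbidden ISO 8859 bytes.

```python
def guess_encoding(data: bytes) -> str:
    """Return a label in the style of Unix 'file' for a byte string."""
    if all(b < 0x80 for b in data):
        return "ASCII text"  # nothing above 7 bits
    try:
        data.decode("utf-8")  # strict decode: every multi-byte sequence must be valid
        return "UTF-8 Unicode text"
    except UnicodeDecodeError:
        pass
    if any(0x80 <= b <= 0x9F for b in data):
        return "Non-ISO extended-ASCII text"  # bytes from the ISO 8859 forbidden range
    return "ISO-8859 text"


if __name__ == "__main__":
    samples = [
        b"123",
        bytes([0x31, 0x32, 0x33, 0xE6, 0xD8, 0xC6]),
        bytes([0x31, 0x32, 0x33, 0x99]),
        bytes.fromhex("e697a5e69cace8aa9e"),
    ]
    for sample in samples:
        print(sample.hex(), "->", guess_encoding(sample))
```

Like 'file', the sketch tries UTF-8 before ISO 8859, so a byte string that happens to be valid as both is labeled UTF-8.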

Regarding the potential for confusion between ISO 8859 and UTF-8, this is what others have to say:

  1. "the probability that a string of characters in any other encoding appears as valid UTF-8 is low, diminishing with increasing string length" (UTF-8 RFC).
  2. "The chance of a random sequence of bytes being valid UTF-8 and not pure ASCII [sic] is 3.9% for a two-byte sequence, 0.41% for a three-byte sequence and 0.026% for a four-byte sequence." (Wikipedia UTF-8 page)

If you're dealing with a page you suspect is not rendering with the correct ISO 8859 subtable, both Firefox and Safari give you the option to change the interpretation. In Safari, check the View menu, Text Encoding. In Firefox, check the View menu, Character Encoding.

BOM: A UTF-8 File Format?

There is a convention called the "Byte Order Mark" (aka BOM) which (hypothetically) can be used to self-identify a UTF-8 file. This consists of beginning a file with the sequence 0xEF 0xBB 0xBF. Both Unix 'file' and 'vi' seem to be pretty good at interpreting it. When a file is pure 7-bit ASCII, but preceded by the BOM, it forces 'file' to report it as "UTF-8 Unicode text". An interesting edge case: if a file containing ISO 8859 8-bit characters that cannot be interpreted as UTF-8 multi-byte sequences is preceded by the BOM, neither Unix 'file' nor 'vi' can be fooled: 'file' will treat the three BOM bytes as ordinary ISO 8859 characters and report "ISO-8859 text", and 'vi' will display the BOM bytes as the Latin1 characters ï»¿.
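
Checking for the BOM yourself is straightforward. A small Python sketch follows; the helper names has_utf8_bom and strip_utf8_bom are invented for the example, while codecs.BOM_UTF8 and the "utf-8-sig" codec are part of the standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the byte string begins with 0xEF 0xBB 0xBF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes with a leading BOM removed, if there is one."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"plain 7-bit text"
    print(has_utf8_bom(sample))
    print(strip_utf8_bom(sample))
    # The "utf-8-sig" codec skips a leading BOM while decoding.
    print(sample.decode("utf-8-sig"))
    # Read as Latin1 instead, the same three bytes become three visible characters.
    print(codecs.BOM_UTF8.decode("latin-1"))
```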

Other than the BOM, refrain from ever calling ASCII, UTF-8, ISO-8859 or WinLatin1 a "file format." None have a standardized header or starting block. They are more properly called "byte sequence conventions."

Unicode HTML Entities

The entire corpus of Unicode is hypothetically available to browsers. For example the Unicode characters of Japanese that were mentioned earlier, U+65E5 U+672C U+8A9E, should successfully render on your browser when I put &#x65E5 ; &#x672C ; &#x8A9E ; in the HTML. Here it goes: 日 本 語. These work for me on Firefox. In practice, I believe that no browsers are 100% capable of correctly rendering the entire 2^31 Unicode character set via HTML entities.

Question: does the presence of the foreign characters you see above effectively make this page Unicode or UTF-8 format? No. The characters I used to make them were merely ampersand, pound, x, and digits, all 7-bit ASCII characters. HTML Entities are merely instructions to browsers what to do with the characters.
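
The same point can be shown outside the browser. In this Python sketch the two function names are hypothetical; html.unescape and the "xmlcharrefreplace" error handler are standard library features. The entity markup stays pure ASCII, and only the interpretation step produces the Japanese characters.

```python
import html


def entities_to_text(markup: str) -> str:
    """Interpret numeric character references the way a browser would."""
    return html.unescape(markup)


def text_to_ascii_entities(text: str) -> str:
    """Rewrite every non-ASCII character as a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    markup = "&#x65E5;&#x672C;&#x8A9E;"
    print(markup.isascii())  # the markup itself is 7-bit ASCII
    print(entities_to_text(markup))
    print(text_to_ascii_entities(entities_to_text(markup)))
```

Going the other way, text_to_ascii_entities is one possible route to 7-bit output that a browser can still render.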

Do we want to use HTML Entities in the cleaned feeds we send to search engines? Do we want to figure out the intent of various WinLatin1 manglings and UTF-8 sequences that come from client feeds and turn them into HTML entities that will render safely on a browser? This is a subject still open to question. Our YSSP search engines can handle them, while our Comparison Shopping Engines (CSE's) do not. Currently we are leaning towards aiming for the lowest common denominator, 7-bit ASCII for all search engines.
WinLatin1's tragic legacy has, unfortunately, been immortalized in HTML entities. HTML entities 128 through 159 will render as their corresponding WinLatin1 characters in many browsers. (Since Unicode entries 128-159 are prohibited, HTML entities are departing from their Unicode basis in this area.) However, you should not consider this reliable; the W3C HTML 4 recommendations exclude these codes from their list.
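
You can watch this legacy at work in Python, whose html.unescape follows the HTML5 parsing rules. The describe helper below is a hypothetical name for illustration; it compares what Unicode says about a code in the 128-159 range with what WinLatin1 and an HTML parser make of it.

```python
import html
import unicodedata


def describe(code: int) -> dict:
    """Show three views of one code value in the 128-159 range."""
    return {
        # Unicode proper treats these code points as control characters.
        "unicode_category": unicodedata.category(chr(code)),
        # WinLatin1 (cp1252) assigns printable characters to most of them.
        "winlatin1": bytes([code]).decode("cp1252", errors="replace"),
        # An HTML5-style parser remaps the numeric entity to the WinLatin1 character.
        "html_entity": html.unescape("&#%d;" % code),
    }


if __name__ == "__main__":
    print(describe(153))  # 153 is 0x99, the WinLatin1 trademark sign
```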

HTML Page Encoding

Can an HTML file be in actual UTF-8 format? "Yes we can." If it contains UTF-8 byte sequences, it gets interpreted by your browser as UTF-8. Here are two nice examples of "UTF-8 Test pages" out there that are just full of exotic Unicode in UTF-8 encoding: Markus Kuhn's sample file, and Frank da Cruz's UTF-8 Sampler. (How well your browser shows the page depends on how good the browser is. For me, Firefox and Safari got most of it, but IE6 failed on large numbers of the characters.)

After you look at one of these pages, try viewing source. What do you see? If your source viewer is UTF-8 capable, you'll see, well, the UTF-8 characters. Surprised? Expected to see the numeric bytes, did you? If you want to see the numeric value of a character under the cursor in vi, press ga. For UTF-8 characters higher than 8 bits you will see large values like 0xFFDD. Or if you want to get really deep into the binary encoding of a file, you could get a binary editor. On Unix, try 'od', or better yet, 'bvi' (you have to download it).

When an HTML page puts this in the <head> block, what does it do?

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

For the most part, it simply communicates intent: a tip to the consuming client what to do with the page. It doesn't change the way the bytes go down the wire, or the way your HTML is stored locally on your disk. It's still just bytes. In practice, you can even leave the declaration out and even though the file is chock full of UTF-8 binary sequences, a good browser figures it out anyway. (Note that Frank da Cruz's file above had no meta direction at all.)

How do you compose (i.e. edit and save) an HTML file in UTF-8? Well, if you're using all 7-bit ASCII characters you're fine, that's still valid UTF-8. But really we mean UTF-8 with higher level Unicode characters. Well, unless you want to enter the binary directly, you'll need a special editor for that; vi, textpad, etc. won't do it for you. Try Google.

UTF-8 and XML

XML is UTF-8 by default; the following declaration is actually redundant:

<?xml version="1.0" encoding="UTF-8"?>

You'd have to override it with a different value for the encoding attribute to make it non UTF-8. Like the HTML meta tag content charset setting mentioned above, the encoding declaration is essentially a tip to the consumer of the XML what to expect.
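
A quick way to see the default in action is to hand raw bytes to an XML parser. This Python sketch uses the standard library's xml.etree.ElementTree; the element_text helper and the tiny <w> documents are invented for the example.

```python
import xml.etree.ElementTree as ET


def element_text(document: bytes) -> str:
    """Parse a small XML document from raw bytes and return the root element's text."""
    return ET.fromstring(document).text


if __name__ == "__main__":
    # No declaration at all: the parser assumes UTF-8.
    print(element_text(b"<w>\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e</w>"))

    # A declaration overrides the default, so these bytes are read as Latin1.
    print(element_text(b'<?xml version="1.0" encoding="ISO-8859-1"?><w>\xe6\xd8\xc6</w>'))

    # The same Latin1 bytes without a declaration are not valid UTF-8.
    try:
        element_text(b"<w>\xe6\xd8\xc6</w>")
    except ET.ParseError as err:
        print("rejected:", err)
```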

Some Fun Historical Facts

Bob Bemer (1920-2004) is said to be the "father of ASCII". His Ford Explorer's vanity plate read 'ASCII' surrounded by a plate holder reading "Yes! I'm the Father of".

Wikipedia has a picture of one of the earliest published ASCII tables, from 1968.

Ken Thompson, the Unix pioneer, invented UTF-8. "It was born during the evening hours of 1992-09-02 in a New Jersey diner, where he designed it in the presence of Rob Pike on a placemat" (source). You can also read the original AT&T UTF-8 Tech Report.

Sunday, May 10, 2009

You on a Diet: book review

During my various attempts to get in good shape over the years I've wasted my share of money on various diet & exercise books that try to tell you what exercise you should do, what muscles they benefit, what foods you should eat and why, and recipes that will make it more enjoyable. Big surprise, lack of willpower has tanked many of my efforts to stay on any kind of healthy eating program. But I really don't think that willpower is the whole story for me or other dieters. What about our ingrained habits of using food for reward, food to fix a poor mood, hedonistic eating, addiction to various kinds of foods (like my favorite, the Sticky Pecan Roll from Au Bon Pain)?

One diet approach that I thought took a step forward in talking about the psychology of diet was the Low-Carb diet (or "Atkins" in its most popular form). It talks about the role that carbs have in making us feel full, and thereby feel good. But I think you could do a lot better. I want to know the mechanics behind appetite, satisfaction and digestion: what actually occurs in the body to make you want to eat huge portions of your favorite foods, what happens as that urge becomes satisfied, and how you can use that knowledge to change your behavior.

I discovered the book You On a Diet while standing in line at a Jamba Juice one day. Using entertaining illustrations, it nails exactly what I was looking for. Some of the points it makes on the mechanics of body chemistry, food and appetite include:
  • The lower gut contains 95% of the body's serotonin, which suggests that eating is self-medicating.
  • Our caveman ancestors were more fit than us because the stresses of survival kept them lean. Stress, like fight-or-flight stress, means that Peptide NPY is inhibited and you don't feel like eating.
  • When fat makes the liver work extra hard, it prevents glucose from getting to our cells, and produces hunger.
  • Fiber is good for diet because it slows "the transit of food across the ileocecal valve, keeping your stomach fuller for longer." (p. 68)
The book identifies these foods that make you feel full, or suppress appetite in some way:
  • Nuts
  • Cinnamon
  • Whole Grains
  • Fiber
  • Red Peppers
  • Smell of grapefruit (p. 88)
  • Brightly-colored food
  • Mint breath strips (p. 239)
  • Fiber supplements (e.g. 1 tbsp Psyllium Husks with a glass of water)
Other eating strategies the book recommends revolve around easing the process of digestion. When the intestines are breaking down food, separating the good nutrients (to go into the bloodstream) from the bad nutrients (to be eliminated), a natural process of inflammation occurs. The intestines do their job well when that inflammation is kept to a healthy level. Bad foods increase the inflammation, and the separation of good vs. bad is impaired. Inflammation-fighting foods include:
  • Lactobacillus CG, a healthy bacteria found in yogurt
  • Omega-3 fatty acids, such as Fish oil, walnuts
  • Green tea
  • Beer (hops)
  • Turmeric
  • Jojoba beans (available in supplements)
  • Soybeans (isoflavones)
  • Lignans, such as Flaxseed oil, rye bread
  • Polyphenols, such as tea, coffee, fruit, vegetables
  • Glucosinolates, such as broccoli, kale, cauliflower
  • Carnosol, found in Rosemary
  • Resveratrol, found in red grapes, juice and wine
  • Dark Cacao
  • Quercetin, as found in cabbage, spinach, garlic, capers, apples, tea, red onion, red grapes, citrus fruit, tomato, broccoli, leafy green vegetables, cherry, raspberry, and lingonberry
  • Antioxidants, as in vegetables and fruits (especially bananas), Vitamin E, Vegetable oils, Tea, coffee, soy, fruit, olive oil, chocolate, cinnamon, oregano and red wine
The book covers some interesting facts on the mechanics of exercise and weight loss:
  • Weight loss improves cholesterol by a factor of three. For example, a 7% weight loss leads to a 20% improvement in cholesterol levels. (p. 120)
  • The beneficial effect of exercise in producing weight loss is greater than the detrimental effect of eating in producing weight gain. So even if you're getting in only a little exercise each day, the effect is significant. (pp. 141-142)
  • Without exercise, we lose 5% of our muscle mass every 10 years after the age of 35. If you don't exercise (rebuild muscle), every 10 years you need to eat 120-420 fewer calories a day to maintain your current weight. (p. 142)
  • Focus on reducing the size of your belly, not weight loss. Exercise not only reduces fat but bulks up muscles, which can result in a net weight change of zero.
The material on "good fats" vs. "bad fats" is helpful:

  • Bad Fats are the ones that stay solid at room temperature: animal fat, butter, stick margarine & lard. Food manufacturers push these because they have a long shelf life.
  • Good Fats are the ones that are liquid at room temperature, the omega-3 and -6 fats: olive, vegetable, sesame & canola oil, fish oil, flaxseed, avocados, nuts (especially walnuts). Nutrients that fight bad fats are niacin and vitamin B5.
Everyone knows that whole grains are good for you, although we use the term so frequently it's helpful to review what this actually means. In a whole grain, "the grain still has all three of its original elements: the outer shell or bran, which contains fiber and B vitamins; the germ, which contains phytochemicals and B vitamins; and the endosperm which contains carbohydrates and protein." (p. 256) "'Refined' grains means only endosperm is enclosed." (p. 257)
Here's a fun subject the book covers: what gives you gas? It's important because it's a byproduct of the way you eat and what you are forcing digestion to do for you:

  • Gas is a normal result of the intestinal inflammation during digestion, as good nutrients go to the bloodstream and bad nutrients go to the lower intestines (and produce gas). So bad gas can be attributed to too many bad foods in your diet. Also, when inflammation is too high, some bad nutrients get into the bloodstream, leading to cholesterol.
  • Sulfur-rich foods such as eggs, meat, beer, beans and cauliflower make gas smell worse
  • Drinking cola means swallowing more air, which means more gas
Other interesting things I found in the book:

  • No evidence yet shows that artificial sweeteners are unhealthy (p. 97)
  • It's the fat around our waist that gets us into trouble. Fat in other parts of your body cause relatively little harm to your health or eating chemistry. (p. 102)
  • Alcoholic drinks fight bad fats (p. 123)
  • The liver is the heaviest organ in the body (p. 77)
  • Your deodorant can make you gain weight, if it contains aluminum or polychlorobiphenols (p. 92)
  • The more brightly colored the food, the better it is for you (p. 95)
So I can give a strong recommendation for You On a Diet if you want to learn a lot about the mechanics behind appetite, satisfaction and digestion.

Sunday, April 12, 2009

iCrossing is hiring a Java developer in Chicago

My employer, iCrossing, has opened a search for a new member for the Merchantize team. Here's the description. To apply for the position, visit, select U.S. Career Offerings, Jobs by Location, then Jobs in Chicago, IL.

Java Software Engineer (Open Source / Web Analytics / ETL)


We’re a people business.

People are the heart and soul of our company, working every day to make our clients’ marketing programs successful.

At iCrossing, we combine experienced talent with world-class technologies to efficiently create marketing programs that truly perform. With more than 620 professionals in 15 offices in the U.S. and Europe, we are equipped to service the digital marketing needs of large enterprises and growing companies alike.

We’re seeking the talented, the experienced and the exceptional to give our clients the most creative and successful solutions for an ever-changing industry. When we find them, we offer a dynamic working environment, competitive compensation, the opportunity to work on exciting client programs, and occasional bagels.
We are seeking a highly motivated and technically proficient JEE Software Engineer / Software Developer to work on our industry leading and mission critical Paid Media Management (Search Engine Marketing, bid management) product.

Features of the position:
• Work on a high-visibility, high performance product that supports iCrossing’s industry leading SEM practice in a growing and fast moving industry.
• Work closely with all of the major search engines (Google, Yahoo, MSN, Ask, AOL) and their APIs.
• Work in a fast moving and forward thinking development environment that is constantly researching and rapidly implementing the latest technologies.
• Research and participate in the advancement and implementation of open source frameworks and architectures such as SOA/ESB, MapReduce, Grid and Cloud computing, and others.
• Work with an experienced Agile Software Development team in a highly collaborative environment.
• Modern Java Enterprise open source based product stack, Java 6, Spring, Hibernate 3, Webworks/Struts 2, JMS, JUnit, MySQL and more.
• Learn current software development best practices (continuous integration, build automation, test driven development, pair programming, agile estimating and planning, etc)
• Apple MacBook Pro, 24” widescreen monitor, IntelliJ or Eclipse.
• A casual, fun, and creative work environment
Major Job Responsibilities / Accountabilities:
• Write test driven quality code.
• Work closely with your dev team.
• Follow and encourage development best practices.
• Develop knowledge of Search Engine Marketing (SEM) principles and techniques.

Required Technologies (At least one or more of the following)
• Spring
• Hibernate
• SQL scripts
• Shell Scripting
• Webwork (Struts 2.0)
• Linux / Unix admin
• Junit (required) or TDD (preferred)
• Grid Computing (GridGain preferred)

Bonus Technologies (Preferred any of these)
• MySQL (especially advanced knowledge of replication, storage engines, backup and recovery)
• Data warehousing design concepts, ETL
• Mondrian OLAP
• Amazon EC2 / S3 / AWS

Knowledge / Skills / Abilities:
• BS in Computer Science or equivalent level of experience
• Understanding and/or appreciation for Agile software development methodologies.
• 1+ yrs of professional development experience.
• Familiarity with source control using Subversion
• Familiarity with IDE tools such as Eclipse or IntelliJ
• Must possess effective interpersonal and communication skills and ability to work successfully in a team environment.
• Good organizational and time-management skills.

Do Not Apply if you:
• Do not know Java
• Have no interest in Agile, TDD or Unit testing
• Are close-minded and don’t want to learn new technologies.
• Are more comfortable working on the same technology you did last year.


Friday, March 27, 2009

Hacking a Linksys NSLU2

I bought a Linksys NSLU2 a while back, which is a low cost (about $99) appliance for turning two USB disk drives into a Network Attached Storage (NAS) system. This lets me set up file storage centrally located on my LAN (as opposed to attaching it to one computer on the LAN and setting up a share).

What's inside is a small Linux computer mounted on a single circuit board. And that's where the fun comes in. As the device's Wikipedia page points out, since the internal Linux is licensed with a GNU General Public License, Linksys was required to release their source code. This has enabled third parties to develop firmware upgrades to the device. One popular upgrade is the Unslung SlugOS, which among various things, enables the device to accept telnet connections.

Here is my network cabinet at home. From left is my DSL modem, a 360 GB USB disk drive, the NSLU2, and my Linksys WRT54G Wireless router. If you know this router, you can see by the size that the NSLU2 is not much bigger than a deck of cards.

Like a router, an NSLU2 hooked up to your LAN will have its own web administration page, which is reachable by

Upgrading to the Unslung firmware went exactly as the directions described it. After restarting, the NSLU2's admin page had a few additions, as you can see in the screen grab below. It added an unslung logo on the upper left, and a "Manage Telnet" link on the right. Once I enabled telnet I was able to log in and get a prompt by telnetting to

A short tour of what is inside the box is shown in the telnet session below. There's cool stuff inside! A nice handful of basic unix commands, the web server that serves up the admin site (above), and even a wget command (which I demonstrate by getting the homepage HTML).

The next thing I wanted to do is to "de-underclock" the device. The CPU is 266 MHz but for unknown reasons, Linksys clocked it down by half with a tiny little resistor. Here's the board:
Simply removing the resistor clocks it up to 266 MHz. Following the helpful instructions on the Unslung site, I geared myself up with some needle nose pliers, a static wrist guard and gloves, a magnifying glass and a geeky miner's light (see pic at left). All I had to do was get a grip on that tiny, tiny resistor (about 1/4 the size of a grain of rice)...and I crunched it! After all, I wasn't going to need it again.

When I put the card back inside the case, reconnected it and restarted, it worked! And I got the proof that the clock speed doubled in my telnet session:

Whoa, it says it is 2.22 MHz short of 266 MHz, I wonder why? Such are the mysteries of computer hardware.

Tuesday, March 17, 2009

Apple iPhone OS 3.0 Announcement summary

Here are highlights of what was announced for the iPhone OS 3.0 release early this afternoon from Apple:
  • Cut and paste: it was worth the wait, the touch interaction to do this looks very cool (see picture at right). Works across applications and does undo.
  • Multimedia messaging: you can attach a picture to a text message
  • Ability to choose a group of photos and send them in a single email
  • Push email notification
  • Landscape mode text entry (so what)
  • Turn-by-turn GPS navigation.
  • Available in the summer. That's as detailed as it gets. No doubt will be linked to the new iPhone model coming out in July.
  • Virtually all of the new features will work with the original, pre-3G iPhone (exceptions: multimedia messaging and stereo bluetooth)
  • Peer-to-peer linkups between individual iPhones for games, file sharing, etc. This exists now with things like AirSharing and Holdem, but those companies probably rolled their own; now it's part of the API
  • API support for applications that connect to external devices. Demonstrations with medical devices were given (see pic at right). Medical applications of new technology are always a big win in corporate presentations; the real news here is that this will open up remoting of all sorts of sophisticated devices for music, video, information systems, anything you can imagine.
  • Ability to search in your emails on the server side, and search in your calendar items
  • Search your iPhone contents with Spotlight (well-known to Mac users)
  • The Sims 3 will run on the iPhone (see pic at right)