Tuesday, October 31, 2006
The cry always went out for more machines, which arrived "clean" and ready to go. Of course, after a testing cycle the new machine could end up just as messed up as all the others. Later, when the next release rolled around it was "we need more machines" all over again.
Reminds me of the old tale about the workbench with 25 screwdrivers and a set of storage slots for the screwdrivers, but only 24 slots. In the beginning each screwdriver was in its appropriately marked slot. Except #25 which someone borrows. Then another worker borrows #15. When the first worker returns, he puts #25 into the only open slot, #15's slot. Then he borrows #7. Now the second worker returns and puts #15 into slot #7. Pretty soon all the screwdrivers are in the wrong slots. The whole storage system fell apart because of a single insufficient slot.
Monday, October 30, 2006
Yes we still use Exchange, Visual Studio, and GotoMeeting.com. But in five years?
Tuesday, October 17, 2006
Then after XP they pinned the needle the other way, and a rewrite orgy began. It produced some good things. XAML is good. They completely rewrote Windows' TCP/IP stack, which is quite an act of chutzpah, and now it supports multi-core CPUs in useful ways. But overall they ended up suffering exactly the sort of problems that a Big Bang approach would bring.
In a way it's a smart move though. Most people are quite happy with XP. Any next version had to be a long-term, next-generation thing. People will love Vista in 2009.
Wednesday, October 04, 2006
Two rules of thumb:
- Make a guess and double it. I find this one surprisingly accurate, probably because I tend to work on the same sorts of projects; but then, don't most programmers?
- Brooks's book The Mythical Man-Month says that every app has a core set of functionality that can be done in X weeks. Creating a product takes 9X because of all the wrapper code for config/reporting/error handling, and all the user documentation that needs to be provided. People constantly underestimate this factor of 9.
The whole game of making & meeting estimates is about what features are "in" and what's "out". I like to tie estimates to an explicit list of what features users will and won't get. Requirements are bound to change, and managers sometimes have a habit of forgetting that your estimate was tied to a feature set.
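The tie between an estimate and its feature list can be sketched in a few lines of Python. This is only an illustration: the `Estimate` class, the feature names and the week counts are all made up, and the factor of 9 is simply the rule of thumb quoted above.

```python
from dataclasses import dataclass, field

PRODUCT_FACTOR = 9  # the "9X" rule of thumb quoted above


@dataclass
class Estimate:
    """Hypothetical estimate that is explicitly tied to a feature list."""
    core_weeks: dict = field(default_factory=dict)   # feature name -> core weeks ("in")
    out_of_scope: set = field(default_factory=set)   # features users will not get ("out")

    def product_weeks(self) -> float:
        # Core effort for everything that is "in", scaled up to a finished product.
        return PRODUCT_FACTOR * sum(self.core_weeks.values())

    def add_feature(self, name: str, weeks: float) -> None:
        # Moving a feature from "out" to "in" necessarily changes the estimate.
        self.out_of_scope.discard(name)
        self.core_weeks[name] = weeks


estimate = Estimate({"call routing": 2, "voicemail": 1}, {"reporting"})
before = estimate.product_weeks()       # 9 * 3 = 27 weeks
estimate.add_feature("reporting", 1.5)
after = estimate.product_weeks()        # 9 * 4.5 = 40.5 weeks
print(before, after)
```

Because the number is computed from the feature list, nobody can add "reporting" and keep quoting the old 27 weeks.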
As usual, Joel has written on this already.
Wednesday, August 23, 2006
Tuning is time-consuming because a human has to listen to the audio recordings of hundreds of phone calls. If the system can log speech rec errors, then the person doing the tuning can zero in on those utterances. If not, and we will see how this can happen, tuning becomes very difficult.
Out-of-grammar errors are the easiest to detect because the speech recognizer returns a NoRec error. These may be due to coughs, background noise or other audio problems that grammar changes can't fix. Others are due to variations in pronunciation that your grammar doesn't know about. People's names are notoriously hard this way because: (a) names are multi-lingual in origin, and (b) many people will mispronounce a name they've only read. We had a Mr Biber here, pronounced bee-ber, but many people would say bye-ber. Tuning revealed this, as well as the fact that some people would say only the last name. The grammar needed to be changed to cover all these possibilities.
The more difficult problem is what we call the Rumplestiltskin error. Most speech engines want to recognize. Try saying "Rumplestiltskin" to a speech-rec auto-attendant. You'll be surprised at how often it will find a (wrong) match. Saying something completely outside the grammar will cause a medium-confidence recognition of some wrong word in the grammar. Of course you can affect this by raising the recognition and confidence thresholds, but that may cause other unwanted recognition problems. Confirmation, for example, is an added step in the dialog that callers can grow to resent. The problem with a Rumplestiltskin error is that it's invisible; the system really has no knowledge that a recognition error has occurred. This makes finding the error a time-consuming task for the tuner. The caller may be saying an acceptable alternative for a word, but until we discover this and add it to the grammar, the system will not be pleasing this caller.
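To make the threshold trade-off concrete, here is a toy sketch in Python. It is not how a real speech engine works: plain string similarity (via the standard `difflib` module) stands in for acoustic matching, and the grammar entries, function names and threshold values are all invented for illustration.

```python
from difflib import SequenceMatcher


def recognize(utterance, grammar):
    """Toy recognizer: always returns the closest grammar entry and a 0..1 score.

    Like the engines described above, it "wants to recognize": it never
    refuses on its own, it just reports a lower score for a poor match.
    """
    scored = [
        (SequenceMatcher(None, utterance.lower(), phrase.lower()).ratio(), phrase)
        for phrase in grammar
    ]
    score, phrase = max(scored)
    return phrase, score


def decide(score, reject_below, confirm_below):
    """Map a confidence score onto the three usual outcomes."""
    if score < reject_below:
        return "norec"     # visible error: gets logged, easy to find when tuning
    if score < confirm_below:
        return "confirm"   # extra dialog step that callers may resent
    return "accept"        # if the match is wrong, nothing is logged at all


grammar = ["Robert Biber", "Rachel Stilton", "Tim Skinner"]  # made-up names
phrase, score = recognize("Rumplestiltskin", grammar)
print(phrase, round(score, 2), decide(score, reject_below=0.3, confirm_below=0.8))
```

Raising `reject_below` turns more Rumplestiltskin matches into visible NoRec errors but also rejects more legitimate callers; raising `confirm_below` catches more wrong matches at the cost of more confirmation prompts.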
Tuning becomes a big problem when dynamic grammars are used, because you can't do any pre-deployment tuning. The app will read a list of phrases from a database, say Little League team names, and generate a grammar. Consider the "Nowell Gnats". It's not unusual for the TTS engine to pronounce a word differently from what the speech rec engine accepts as a pronunciation. This is bad because your prompt may say it as "you can say Nole Nats..." whereas the speech rec engine wants "Now-well Nats".
It's easy to end up with a situation where the app won't accept any sensible pronunciation for a word, and even tells callers the wrong pronunciation to use! All without generating any recognition errors. Silent failure.
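One way to catch this before callers do would be to compare the two engines' pronunciations when the grammar is generated. The Python sketch below assumes you can get at both; here two plain dictionaries stand in for the TTS and speech rec lexicons, the pronunciation strings are informal spellings rather than real phoneme sets, and the "Dover Hawks" team is made up.

```python
def build_grammar(phrases):
    """Build a dynamic grammar (here just a sorted, de-duplicated phrase list)."""
    return sorted({p.strip() for p in phrases if p.strip()})


def pronunciation_mismatches(grammar, tts_lexicon, asr_lexicon):
    """Find words the prompt will say one way but the recognizer expects another."""
    mismatches = []
    for phrase in grammar:
        for word in phrase.split():
            spoken = tts_lexicon.get(word.lower())           # what the prompt will say
            accepted = asr_lexicon.get(word.lower(), set())  # what the recognizer accepts
            if spoken is None or spoken not in accepted:
                mismatches.append((word, spoken, sorted(accepted)))
    return mismatches


team_names = ["Nowell Gnats", "Dover Hawks"]  # stand-in for the database query
tts_lexicon = {"nowell": "nole", "gnats": "nats", "dover": "doh-ver", "hawks": "hawks"}
asr_lexicon = {
    "nowell": {"now-well"},
    "gnats": {"nats"},
    "dover": {"doh-ver"},
    "hawks": {"hawks"},
}

grammar = build_grammar(team_names)
mismatches = pronunciation_mismatches(grammar, tts_lexicon, asr_lexicon)
print(mismatches)  # [('Nowell', 'nole', ['now-well'])]
```

Anything this check flags is a word to listen to by hand, since the running system will never report it as an error.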
Can the speech rec platform help? Yes, here are a couple of ideas. First, the app should have a "tuning mode" that can be set temporarily. It increases reco thresholds to force more confirmations, and logs all rejected confirmations as candidates for tuning. Secondly, the system should have a batch process that studies patterns of calls. If the same person calls back several times in a row and never seems to complete a transaction, then these calls are also tuning candidates.
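Both ideas come down to picking out the calls worth listening to. Here is a minimal Python sketch of that selection, assuming a hypothetical call log in chronological order; the `CallRecord` fields, the phone numbers and the run length of 3 are illustrative choices, not part of any real platform.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical call-log entry; a real platform would log far more."""
    caller_id: str
    completed: bool                      # did the caller finish a transaction?
    confirmation_rejected: bool = False  # caller answered "no" to a confirmation


def tuning_candidates(calls, min_failed_run=3):
    """Return indexes of calls a tuner should listen to first.

    `calls` must be in chronological order.
    """
    candidates = set()

    # Idea 1: in tuning mode, every rejected confirmation is a candidate.
    for i, call in enumerate(calls):
        if call.confirmation_rejected:
            candidates.add(i)

    # Idea 2: the same caller failing several times in a row.
    by_caller = {}
    for i, call in enumerate(calls):
        by_caller.setdefault(call.caller_id, []).append(i)
    for indexes in by_caller.values():
        run = []
        for i in indexes:
            if calls[i].completed:
                run = []  # a completed call breaks the streak
            else:
                run.append(i)
                if len(run) >= min_failed_run:
                    candidates.update(run)

    return sorted(candidates)


calls = [
    CallRecord("555-0101", completed=False),
    CallRecord("555-0102", completed=True, confirmation_rejected=True),
    CallRecord("555-0101", completed=False),
    CallRecord("555-0103", completed=True),
    CallRecord("555-0101", completed=False),
    CallRecord("555-0103", completed=False),
]
print(tuning_candidates(calls))  # [0, 1, 2, 4]
```

The second pass is what surfaces silent failures: none of the three calls from 555-0101 logged a recognition error, yet together they look like someone the grammar is not serving.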
Tuesday, August 08, 2006
I have fairly mixed views about this move. First, the negatives. Yet again, Microsoft is making a mid-course correction with their speech offering. At the beginning the vision was multimodal speech. Then it was telephony-based speech rec by web developers (by sprinkling "SALT" on their web pages to speech-enable them). Then it was telephony-based speech rec using more standard methods, such as VoiceXML. Now it's speech rec as part of an enterprise information system. For developers actually trying to build applications, all these course corrections are unnerving, to say the least. If the overall goal, as Microsoft says, is to create an "ecosystem" then they need to realize that stability is a key factor. Wandering climate change isn't helping.
Now the positives. OCS is a strategic product for Microsoft. Communication is becoming a key part of every organization. Communication across devices. Synchronous and asynchronous communication. Features such as: presence, IM, VoIP, video, Find-Me, Follow-Me, and ad-hoc conferencing. Think EMail++. For speech rec to be bundled into such a strategic product is a huge win. It also shows that the speech folks at Microsoft have been listening and learning. Changing course often works out better than not changing course!
Let's hope the transition to OCS is as seamless as possible for existing MSS speech developers.
Turns out it was a Longhorn bug with its audio software. Today Rob did the demo again, at SpeechTek no less! Eight minutes long and it worked fabulously. It was a nice comeback and a gutsy move -- way to go guys!
Friday, June 16, 2006
Charlie Wilson was a Democratic congressman from Texas. Most of the book deals with his efforts in Washington to get funding and support to his beloved mujahideen. Washington is revealed not as a massive bureaucratic system, or machine, but as a personality-driven clique. One person, with the right skills (and Mr Wilson seems to have been a master), can find and grab the levers of power over literally billions of dollars in funding.
The other feature is the Alice-in-Wonderland nature of the world of the 1980s vs today's post-9/11 world. Then it was the Soviets who invaded a Muslim country and faced a fierce insurgency. The US supported the insurgency at the urging of a Democratic president.
They supported (and were cheered by) fundamentalist jihadis, giving them secret weapons for attacking convoys and shelling bases.
One interesting question regarding today's Iraq is the lack of a Stinger missile. It made a huge difference in the Afghan war, as it would in Iraq. Strange that twenty years on there aren't tons of Stingers and their imitations available on the black market.
A movie is being made of the book.
MSDN is great if you already know the name of the class or function that you're interested in. But if you don't, then it can be very hard to find what you need. Date formatting, for example, is under string.Format() instead of under DateTime.
C# Cookbook is goal-driven. Each topic, such as "Writing a TCP Server", "Increasing StringBuilder Performance" or "Creating a Priority Queue", is a full solution to a programming problem. It covers C# 2.0, including generics and anonymous methods.
Put it beside your desk. Read ten snippets a day, and in a month you'll be a better programmer.
Wednesday, June 07, 2006
The kids couldn't believe we were planning to eat in this low-ceilinged room with creaky floors and exposed heating ducts. My older son couldn't get over the menu. He kept asking "But there's only two items on the menu?". Yes, I'd reply: chicken or ribs. "Just two things!?" and "Is that really it?" For someone used to the multi-page menus of a typical eatery, it blew his mind.
Son #2 is precise and his criticisms were detailed: the ribs had too much sauce, the fries were too long, and "this bun sucks!". Summing up, he concluded that "at least a crappy restaurant could have better lighting". Needless to say this sort of side commentary dulls the enjoyment of the others at the table. But we all got through it, and the food is actually pretty good. A broadening experience; for when I asked days later if they had told their friends about the Bar-B-Barn, they said "it wasn't that bad."
Friday, May 26, 2006
Whoever thought up the idea of human-readable message protocols should get a Nobel Prize. Protocols like SIP, HTTP and RSS are so much easier to troubleshoot because of this. The bad old days of RPC, ASN.1, and ISDN are hopefully nearly gone.
Of course, security may force encryption of the human-readable protocols, so we may end up back where we started...
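A small Python sketch of why the text protocols are easier to troubleshoot: the HTTP request below can be read straight off a network trace, while the binary version needs a decoder and a spec beside you. The binary layout is invented purely for contrast (it is not ASN.1 or any real protocol), and `parse_request` is a deliberately naive helper, not a full HTTP parser.

```python
import struct

# A text request is readable exactly as it appears on the wire.
text_request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"


def parse_request(raw):
    """Naive parse of a body-less HTTP/1.x request."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, target, version, headers


# Invented binary equivalent: 1-byte opcode (1 = "get"), 2-byte big-endian
# path length, then the path itself.
path = b"/index.html"
binary_request = struct.pack("!BH", 1, len(path)) + path

print(text_request.decode("ascii"))  # readable without any tools
print(binary_request.hex())          # meaningless without the spec
print(parse_request(text_request))
```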