Category Archives: Code & Development

Random? That’s a Coincidence…

A randomly selected image this morning - my old VW Eos
Camera: Panasonic DMC-GX7 | Date: 30-01-2015 17:00 | Resolution: 4894 x 3059 | ISO: 320 | Exp. bias: -66/100 EV | Exp. Time: 1/60s | Aperture: 5.0 | Focal Length: 17.0mm | Lens: LUMIX G VARIO PZ 14-42/F3.5-5.6

My programming project of the last few weeks has been to build my own “rolling portfolio”, which shows random images from my photographic portfolio as either a screensaver or a rolling display on a second monitor. I’ve implemented a number of features I’ve always wanted but never had from freeware/shareware options, like precise control over timing, the ability to quickly add a note if I see a required correction, and the ability to locate and review recent images if someone says “what was that picture you were just showing?”.

Having previously blogged about the poor quality of “random” algorithms in Android music player apps (see How Hard Can It Possibly Be?), I decided to put my money where my mouth is and write my own preferred random algorithm. This does a recursive, random walk down the selected folder tree until it either finds an image file or a dead end (and then tries again). This was refreshingly easy to implement, and as expected it runs quickly without needing any prior indexing of the content.
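
For the curious, here’s a minimal sketch of that walk. The real app is .Net; this Python version is illustrative only, and the image extensions and retry limit are my own assumptions:

```python
import os
import random

IMAGE_TYPES = {".jpg", ".jpeg", ".png", ".tif"}  # assumed extensions

def random_image(root, max_attempts=100):
    """Walk randomly down the folder tree until an image turns up.
    A dead end (an empty folder or a non-image file) restarts the walk."""
    for _ in range(max_attempts):
        current = root
        while True:
            entries = os.listdir(current)
            if not entries:
                break  # dead end: empty folder, start again from the root
            pick = os.path.join(current, random.choice(entries))
            if os.path.isdir(pick):
                current = pick  # descend one level and choose again
            elif os.path.splitext(pick)[1].lower() in IMAGE_TYPES:
                return pick  # success: a randomly selected image
            else:
                break  # dead end: some other file type, start again
    return None
```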

Also as expected, the simplest implementation returned a disproportionate number of hits (and therefore a lot of repeats) from folders with a very small number of images, but that was easily fixed by adding a “weighting” at the second stage of the walk, to reduce the number of hits on smaller portfolios.

Job done? Maybe. I started to notice that I still see the same image selected twice in quick succession, and sometimes more than twice over a day or two. At first I thought this might be an issue with seeding the random number generator, so that I was re-generating the same random sequences, but a quick check confirmed that wasn’t the problem. The next most obvious possibility (to me!) was an issue with the Microsoft .Net random() function, so I added some logging to the app, recording each random number, and then fed a day’s worth through some frequency analysis in Excel. That got Microsoft off the hook with a clean bill of health: there’s a slight preponderance of zeros, which I can explain, but otherwise the spread of results looks fine.

At the same time, I also added logging for the selected images themselves. In yesterday’s working-hours operation the screensaver showed 335 images, of which no fewer than 21 were duplicates. Given that I have over 3500 images in the portfolio, this seems very high, but maybe not.

This is a known problem in mathematics, a generalisation of the “birthday problem”. It’s so called because a common formulation is the question “given a room of people, what is the probability that at least two have the same birthday?”. While you need at least 367 people to guarantee a duplicate, the counter-intuitive result is that with just 23 people in the room a duplicate is more likely than not. The generalised equation for the solution is the following:

E = k - n + n(1 - 1/n)^k

In this, n is the number of items, k is the number of random selections, and E is the expected number of duplicates. Feed in k = 335 and n = 3500, and you get E ≈ 16. That’s close enough to my observed value of 21 (this is all random, so any one measurement might fall either side of the expected value, but the order of magnitude is right). Couple this with the way my mind works, looking for patterns, and I must therefore expect to see some repetition. It’s clear that the algorithm is working fine; this is just the normal workings of probability.

Another implication of this is that as the sample grows, some images will naturally appear several times, and others may not appear at all. If we take 3500 samples, the expected number of duplicates rises to over 1200, so over 1/3 of the images will still be unselected.
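
Both figures are easy to sanity-check; here’s the formula as a short Python sketch (my choice of language here is purely for illustration):

```python
def expected_duplicates(n, k):
    """E = k - n + n(1 - 1/n)^k: expected duplicates when drawing k
    times, uniformly and with replacement, from n items."""
    return k - n + n * (1 - 1 / n) ** k

print(round(expected_duplicates(3500, 335)))   # ~16, against 21 observed
print(round(expected_duplicates(3500, 3500)))  # ~1288 duplicates in 3500 draws
```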

Do I fix this? The relatively simple resolution is to keep a list of selected images, and use that to discard any selections which are repeats within a given period. However I would rather run this without a data store, and maybe, now that I can explain the repetition, I’m comfortable. Time will tell.

Posted in Code & Development, Photography, Thoughts on the World

That Was Too Easy…

There is an old plot device, which goes back to at least Homer, although the version which popped into my head this evening was Genesis of the Daleks, a 1970s Dr Who story. A group of warriors fight a short but intense battle, and appear to triumph. In Dr Who, the Kaled freedom fighters burst into Davros’s headquarters and think they have dispatched him and his dalek bodyguards. Just as they are starting to celebrate, one of them, typically an old, grizzled soldier who has been round the block a few times, says "Have your instincts abandoned you? That was too easy." True enough, a few seconds later the elaborate trap is sprung, and the tables are turned.

Android 8 is like that. Not that it’s in the service of a malevolent genius, although I’m beginning to wonder, but it lulls you into a false sense of security, and then throws some significant challenges at you.

I got a new phone last week. I have loved my Sony Xperia XA Ultra, which I have used for the last two years, but have been constantly frustrated by its miserly 16GB main memory. The Xperia XA1 Ultra is an almost identical device, but with a decent amount of main storage. I had to forgo the cheerfully "bling" lime gold of the XA for a dusky metallic pink XA1, but otherwise the hardware change was straightforward.

So, initially, was the transfer. Android now has a feature to re-install the same applications as on a previous device, and, where it can, transfer the same settings. This takes a number of hours, but seems to work quite well. I had to manually transfer a few things, but a couple of hours in I worked through the list of applications, and most seemed to be in order with their settings. I could even see the same pending playlist in the music player which, after a lot of trial and error, I installed to randomly play music while I’m on the bus.

The new version of the Android alarm/clock app seems to be complete b****cks, and more trouble than it’s worth, but there’s no barrier to installing the old version which seems to work OK. My preferred app to get Tube Status updates is no longer available to download, but I could reload the old version from a backup. So that was most of the problems in the upgrade dealt with.

My instincts had abandoned me. It was too easy…

I had also forgotten Weinberg’s New Law: "Nothing new works".

I got to the gym, and tried to play my music, using the standard Sony music player. Some of it was there, but the playlist I wanted wasn’t. I realised the app could no longer see WMA files (Windows Media format), which make up about 95% of my collection. A bit of googling, and it turned out the recommendation was to install PowerAmp, which I did, and it worked fine.

Then I got on the bus, and tried to play some randomised music. Nothing. The app had the files in its playlist, but couldn’t find them. I rapidly confirmed that the problem again was WMA files, which had suddenly become "invisible" to the app. After yet more trial and error installing, the conclusion is that it’s the Android Media Storage service which is at fault. Apps which build their own index (like PowerAmp) are fine. Apps which are built "the proper way" and use the shared index are screwed, because in the latest version of Android this just completely ignores WMA files.

Someone at Google has taken the decision to actively suppress WMA files from those added to the index. This isn’t a question of a problematic codec or similar – they had perfectly good indexing code which worked, and for some reason it has been removed or disabled. I can only think it’s some political battle between Microsoft and Google, but it’s vastly frustrating that users are caught in the crossfire.

I trust Dante reserved some special corner of Hell for those who break what works, for no good reason. If his spectre wants a bit of support designing it, I’ll be glad to help.

And I’ll resist saying "that wasn’t too bad" when I upgrade my technology…

Posted in Android, Thoughts on the World

Inferring Algorithms: How Random is Your Music Player?

“You’re inferring that I’m stupid.”

“No, I’m implying that you’re stupid. You’re inferring it.”

– Wilt, by Tom Sharpe

My latest contract means spending some time on a bus at each end of the day. The movement of the bus means it’s not comfortable to read, so I treated myself to a nearly new pair of decent Bluetooth headphones, and rediscovered the joys of just listening to music. I set the default music player app to “random” and let it do its stuff.

That’s when the trouble started. I started thinking about the randomisation algorithm used by the music player on the Sony phone. I can’t help it. I’m a software architect – it’s what I do.

One good music randomisation algorithm would look like this:

  1. Assign every song on your device a number from 1 to n
  2. When you want to play a random song, generate a random number between 1 and n, and play the song with that number.

However in my experience no-one ever implements this, as it relies on maintaining an index of all the music on the device, and assigning sequential numbers to it. That’s not actually very difficult, given that every platform indexes the music anyway and a developer can usually access that data, but it’s not the path of least resistance.
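
In code the core of it really is a one-liner; a sketch, assuming the index is simply a list of file paths:

```python
import random

def pick_random_song(index):
    """Algorithm 1: every song holds a number from 1 to n; pick one
    uniformly. `index` is assumed to be a list of file paths built
    from the platform's media database."""
    return random.choice(index)  # uniform, so no structural bias
```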

Let’s also say a word about generating random numbers. In reality these are always pseudo-random, and depending on how you seed the generator the values may be predictable. That may be the case with Microsoft’s software for picking desktop backgrounds, which seems to pick the same picture simultaneously on my laptop and desktop more often than I’d expect, but that’s a topic for another blog, so for now let’s assume that we can generate an acceptably random spread of pseudo-random numbers in a given integer range.

Here’s another algorithm:

  1. Start in the top directory for the music files
  2. Pick an item from that directory at random. Depending on the type:
    • If it’s a music file, play it. When finished, start again at step 1
    • If it’s a directory, make it your target and redo step 2
    • If it’s anything else, just repeat step 2

This is easy to implement, runs quickly and plays nicely with independently changing media files. I’ve written something similar for displaying random pictures on a website. It doesn’t require maintaining any sort of index. It generates a good spread of chosen files, but will play albums which are alone under the first level root (usually the artist) much more than those which have multiple siblings.

My old VW Eos had a neat but very different system. Like most players it could work through the entire catalogue in order, spidering up and down the directory structure as required. In “random” mode it simply calculated a number from 1 to approximately 30 after each song, and used that as the number of songs to skip forwards in the sequence.

This was actually quite a good algorithm. As well as being easy to implement it had the side-effect of being at least partially predictable, usually playing a couple of songs by the same artist before moving on, and allowing a bit of “what’s next” guesswork which could be entertaining on a long drive.
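
As I understand the description, the logic amounts to something like this (the wrap-around at the end of the catalogue is my assumption, not VW’s documented behaviour):

```python
import random

def vw_next_song(position, catalogue_size):
    """After each song, skip a random 1 to ~30 tracks forward through
    the in-order catalogue, wrapping at the end."""
    return (position + random.randint(1, 30)) % catalogue_size
```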

So what about the Sony music app on my phone? At first it felt like it was doing the job well, providing a good mix of genres, but after a while I started to become suspicious. As it holds the playlist in a readable form, I could check that suspicion. These are key highlights from the playlist after about 40 songs:

  • 1 from ZZ top
  • 1 from “Zumba”
  • 3 from Yazoo!
  • 1 from Wild Cherry
  • 1 from Wet Wet Wet
  • Several from “Various Artists” with album titles like “The Very Best…”
  • 0 from any artist filed under A-S!

I wasn’t absolutely sure about the last point. What about Acker Bilk and Louis Armstrong? Turns out they are both on an album entitled “The Very Best of Smooth Jazz”…

I can also look ahead at the list, and it doesn’t get much better. Van Morrison, Walter Trout, The Walker Brothers, and more Wet Wet Wet 🙁

So how does this algorithm work (apart from “badly”)? I have a few hypotheses:

  • It implements a form of the “give every track a number” algorithm, but the index only remembers a fixed number of tracks (a few hundred, maybe ~1000), and anything read earlier in the indexing process is discarded.
  • It implements the “give every track a number” algorithm, but the random number generator is heavily biased towards the end of the number range.
  • It’s attempting a “random walk”, skipping a random number of steps forwards or backwards through the list at each play (a bit like the VW algorithm, but bidirectional). If this is correct it’s odd that it has never gone into “positive” territory (artists beginning with A-S), but that could be down to chance and not impossible. The problem is that without a definite bias a random walk tends to stay in the same place, so it’s a very poor way of scanning your music collection.
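
That third hypothesis is easy to simulate. In the sketch below (collection size from my own figures, step range borrowed from the VW algorithm, and the assumption that the walk starts at the end of an A-Z ordering) the walk typically finishes within a few hundred places of its starting point, which would be consistent with the A-S drought:

```python
import random

n = 11000            # roughly the number of files in the collection
position = n - 1     # assume the walk starts at the end of the A-Z order
for _ in range(40):  # about the number of songs played so far
    step = random.choice([-1, 1]) * random.randint(1, 30)
    position = max(0, min(n - 1, position + step))
print(position)      # typically still within a few hundred of the start
```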

Otherwise I’m at a loss. It’s not like I have a massive number of songs and could have run into an integer size limit or similar (there are only around 11,000 files, including directories and artwork).

Ultimately it doesn’t matter that much. I can live with it for a while and I can probably resolve the issue by downloading another music player app. However you can’t help feeling that a giant of entertainment technology like Sony should probably manage better.

Regardless of that, it’s an interesting exercise in analysis, and also potentially in design. Having identified some poor models, what constitutes a “good” random music player? I’ve seen some good concepts around grouping songs by “mood”, or machine learning from previous playlists, and I’ve got an idea forming in my head about an app being more like a radio DJ, looking for “links” between the songs in terms of their artist names, titles or genres. Maybe that’s the next development concept. Watch this space.

Posted in Code & Development, Thoughts on the World

Why REST Doesn’t Make Life More Rest-full

Really Rest-full (Cuba 2010)
Camera: Canon EOS 7D | Lens: EF-S15-85mm f/3.5-5.6 IS USM | Date: 20-11-2010 15:41 | ISO: 200 | Exp. bias: -1/3 EV | Exp. Time: 1/250s | Aperture: 9.0 | Focal Length: 53.0mm (~85.9mm) | Lens: Canon EF-S 15-85mm f3.5-5.6 IS USM

As I have observed before, IT as a field is highly driven by both fashion and received wisdom, and it can be difficult to challenge the commonly accepted position.

In the current world it is barely more politically acceptable to criticise the currently-dominant model of REST, Javascript and microservices than it is to audibly assess the figure of a female co-worker. I was seriously starting to think that I was in some age-defined Luddite minority of one in not being 100% convinced about the universal goodness of that model, but then I discovered an encouraging article by Pascal Chambon, “REST is the new SOAP”, and realised that it’s not just me. I am not alone.

I don’t want to re-create that excellent article, and I recommend it to you, but it is maybe instructive to provide some additional examples of the failings Chambon calls out. I have certainly fallen foul of the quasi-religious belief that REST is somehow “better because it uses the right HTTP verbs”, and that as a result the “right verbs must be used”. On my last contract there was a lengthy argument because someone became convinced I was using the wrong ones. “You’re using POST to do a DELETE. That’s wrong.”

“No, we’re submitting a request to do a delete, if approved. At some later point, after the request has been reviewed and processed, this may or may not result in a low-level delete action, but the API is about the request submission. And anyway, you can’t submit a proper payload with a DELETE.”

“But you’re using a POST to do a DELETE…”

In the end I mollified him slightly by changing the URL of the API so that the endpoint wasn’t …/host, but …/host/request, but that did feel like the tail wagging the dog.

Generally REST promotes a fairly inflexible CRUD model, by default without the ability to specify exactly which items are retrieved or updated. In a good design we may need a much richer set of operations. In either an RPC approach (as outlined in Chambon’s article) or a “remote object access” approach, such as one based on SOAP, we can flexibly tailor the operations precisely to the needs of the solution.

Here’s a good example. I need to “rename” an object, effectively changing its primary key. In the REST model, I have to choose one of the following:

  • Add extra fields to the PUT payload to carry the “new” and “old” keys, and write both client- and server-side conditional code around their values, or an additional “operation” value
  • Do a DELETE (with the old key) followed by a POST (with the new one), making sure that all the other data required to recreate the record is passed back for the POST, and write a host of additional code to handle cases like the DELETE succeeding but the POST failing, or the POST being treated as a new item, not just an update (because it’s not a PUT).
  • Have a dedicated endpoint (e.g. …/object/rename) which accepts a POST operation with just the required data for the rename. That would probably be my favourite, but I can hear the REST purists screaming in the wind…
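
For illustration, a minimal sketch of that third option (Flask chosen arbitrarily; the endpoint and field names are my own, not from any real API):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/object/rename", methods=["POST"])
def rename_object():
    """Dedicated rename endpoint: a POST carrying just the data the
    operation needs, rather than a contorted PUT or DELETE+POST pair."""
    payload = request.get_json()
    old_key, new_key = payload["oldKey"], payload["newKey"]
    # ... look up the record by old_key and re-key it atomically ...
    return jsonify({"renamed": {"from": old_key, "to": new_key}})

if __name__ == "__main__":
    app.run()
```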

In a SOAP model, I can just have an explicit Rename(oldkey, newkey) operation on a service named for the underlying business object. Simples.

So Is SOAP The Old REST?

I’m comfortable with Chambon’s casting of REST as the supposed handsome hero who turns out to be a useless, treacherous bastard. I’m less comfortable with the casting of SOAP as the pantomime villain (boo hiss).

Now your mileage may vary, and Chambon obviously had some bad experiences, but in my own experience SOAP is a very strong and reliable technology which a lot of the time “just works”. I’ve worked in environments where systems developed in .Net, Oracle, Enterprise Java, a LAMP stack and Python cheerfully exchanged data with each other using SOAP, across multiple physical locations, with relatively few complexities and usually just a couple of lines of code to access a full object model with formal schema and policy support.

In contrast, even if you navigate through the various different ways a REST service may work, inter-platform operation is by no means as simple as claimed. In just the past week I wasted about half a day trying to pass a body parameter between a Python client and a REST API presented by .Net. It should have worked. It didn’t. I converted the service to SOAP, and it worked almost first time. (Almost. It would have been even quicker if I’d remembered to RTFM…)

Notwithstanding the laudable attempts to fill the gap for REST, SOAP is still the only integration technology where every service has full machine and human readable documentation built in, and usually in a standard fashion. Get a copy of the WSDL (Web Service Definition Language) either from the service itself, or separately, and you know what it does, with what data, and, where it’s relevant to the client, how.

To extend the theatrical metaphor, in my world SOAP is the elderly retired hero who’s a bit pedantic and old-fashioned, maybe a bit slow on his feet, but actually saves the day.

It’s About the Architecture, Stupid

Ultimately it doesn’t actually matter whether your solution uses REST, SOAP, messages, distributed objects or CSV file transfers. Any can be made to work with sufficient attention to the architecture. All will fail in the presence of common antipatterns such as complex mixed data models, massive functional decomposition to too fine a level, or trying to make high-frequency chatty exchanges over higher-latency links.

Modern technologies attempt to hide a lot of technical complexity behind simple abstraction layers. While that’s an excellent approach overall, it does raise a risk that developers are unaware of how a poor design may cause underlying technical problems which will cause failure. For example while some low-level protocols are more tolerant than others, the naïve expectation that REST will work over any network regardless “because it is based on HTTP” is quite wrong. REST, SOAP and plain old web pages can all make good, efficient use of HTTP. REST, SOAP and plain old web pages will all fail if you insist on a unit of work being composed of vast numbers of separate small exchanges rather than a few larger ones. They will all fail if you insist on transferring large amounts of unfiltered data to the client, when that data should be pre-processed and filtered on the server. They will all fail if you insist on making every low-level exchange a network service when many of these should be direct in-process operations.

Likewise if you have a load of services, whether your own microservices or third party endpoints, and each service defines its own data structure which may be subject to change, and you try and directly consume and produce those proprietary data structures everywhere you need them, you are building yourself a world of pain. A core common data model with adapters for each format will serve you much better in the long run.

So Does Technology Choice Matter?

Ultimately no. For example, I have built an architecture with an underlying canonical data and adapter model but using REST for every exchange we controlled and it worked fine. Also in the real world whatever your primary choice you’ll probably have to deal with all the others as well. That shouldn’t scare you, but I have seen REST-obsessed developers run screaming from the room at the thought of having to use SOAP as well…

However, a good base choice will definitely make things easier. It’s instructive to think about a layered model of the things you have to define in a complex integration:

  • Documentation
  • Functionality
  • Data structure and format
  • Data encoding and transport
  • Policies
  • Service location and routing

SOAP is unique among the options in always providing built-in documentation for the service’s functions, data structures and policies. This is a major omission in the REST world, which is progressively being addressed by the Swagger / OpenAPI initiative and variants, but they will always be optional add-ons with variable coverage rather than a fundamental part of the model. For all other options, documentation is necessarily external to the service itself, and it may or may not be up to date and available to whoever needs it.

Functionality is discussed above and in Chambon’s article. Basically REST maps naturally to CRUD operations, and anything else is a bit of a bodge. SOAP and other RPC or distributed object models provide direct, explicit support for whatever functions are required by the business problem.

SOAP provides built-in definition and documentation of data structures and formatting, using XML Schema which means that the definition is machine and human readable, standardised, and uses namespaces and references to manage, for example, items with the same name but different uses and formats. Complexities such as optionality and alternative structures are readily defined. In addition a payload can be easily verified against the defined schema. Swagger optionally adds similar capabilities to the REST model, although without some discipline it’s easy for the implemented service to differ from the documented one, and it’s less easy to confirm that a given payload conforms. Both approaches focus on syntactic definition with semantic guidance optional and mainly through comments and examples.

In terms of encoding the data, the fashionable approach is JSON. The major benefits are that it’s simple, payloads are a bit smaller than the equivalent XML, and that it’s easy to parse into and generate from equivalent data structures in languages like Python.

However, I’m not a great follower of fashion. XML may be less trendy, but it offers a host of industrial-strength features which may be important in more complex use cases. It’s easy to unambiguously indicate the schema for each document and validate against it. If you have non-ASCII or binary data then their encoding is unambiguously defined. It’s easy to work separately with fragments of a larger document if you need to. Personally I also find XML easier to read and manually edit if I have to, but I accept that’s a bit subjective. One argument is that JSON is easier to render into an HTML page, but I’ve achieved much the same without any procedural code at all using XML with XSLT.

Of course, there’s no real need to have to choose. The best REST APIs I have worked with have the ability to generate equivalent JSON and XML from the same queries, and you choose which works best in a given context. Sadly this is again a bit too much for the REST purists, but a good solution when it works.
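
From the client side, choosing per context can be as simple as a content-negotiation header; a sketch against a hypothetical endpoint:

```python
import requests

url = "https://api.example.com/objects/42"  # hypothetical API
as_json = requests.get(url, headers={"Accept": "application/json"}).json()
as_xml = requests.get(url, headers={"Accept": "application/xml"}).text
```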

Beyond the functional definition of a service and its data, we also have to consider the non-functional behaviours, what are often referred to as “policies” in this context. How is the service secured? What encryption is applied to payloads and headers? What is the SLA, and what action should you take if it is exceeded? Is asynchronous or callback behaviour defined? How do I confirm I have all the required items in a set of exchanges, and what do I do about missing ones? What happens if a service fails, or raises an error?

In the early 2000s, when web services were a new concept, a lot of effort was invested in trying to establish standard ways to define these policies. The result was a set of extensions to SOAP known as the WS-* specifications: a set of rules to enable direct and potentially automated negotiation of all these aspects based on standardised information in the service WSDL and SOAP headers. The problem was that the standards quickly proliferated, and created the risk of making genuinely simple cases more complex than necessary. REST emerged as a simpler alternative, but with a KISS ethic which means ignoring the genuinely complex.

Chambon’s article touched on this in his discussion of error coding, but there are many other similar aspects. REST is a great solution for simple cases, but should not blind the developer to SOAP’s menu of standard, stronger solutions to more difficult problems.

A similar choice applies at the final level, that of locating and connecting service endpoints at runtime. For many cases we simply rely on network infrastructure and services like DNS and load balancing. However when this doesn’t meet more complex requirements then the alternatives are to construct or adopt a complex proprietary solution, or to embrace the extended standards in the WS-* space.

One technology choice is important. A professional modern Integrated Development Environment such as Visual Studio or IntelliJ IDEA will do much of the “heavy lifting” of development, and does make work much quicker and less error-prone. I completely fail to understand why in 2018 some developers are still trying to do everything with vi and a Unix command line. When I was a schoolboy in the 1970s there was a saying “shouldn’t you have handed that in at the end of the war?”, referring to people still using or hoarding equipment issued in WW2. Anyone who is trying to do software development in the late 2010s with the software equivalent deserves what they get… It is a mistake to drive a solution from the constraints of your toolset.

Conclusions

The old chestnut that “to the man who only has a hammer, every problem looks like a nail” is nowhere more true than in software development. We seem to spend a great deal of effort trying to make every new software technique the complete solution to life, the universe, and everything, rather than accepting that it’s just another tool in the toolbox.

REST is a valid addition to the toolbox. Like its predecessors it has strengths and weaknesses. It’s a great way to solve a whole class of relatively simple web service requirements, but there are definite boundaries to that capability. When you reach those boundaries, be prepared to embrace some older, less-fashionable but ultimately more capable technologies. A religious approach will fail, whereas one based on an architectural viewpoint and an open assessment of all the valid options has a much greater chance of success.

Posted in Agile & Architecture, Code & Development

An Odd Omission

Let’s start with a common use case…

"I have a television / hi-fi / home cinema system which has several components from different manufacturers. I would like to control all of them with a single remote control. I would like that remote control to be configurable, so that I can decide which functions are prioritised, and so that I can control multiple devices without having to switch "modes". (For example, the primary channel controls should change the TV channel, but at the same time and without changing modes the volume controls should change the amplifier volume.) As not all of my devices are controllable via Wi-Fi, Infrared is the required primary carrier/protocol. The ideal solution would be a remote control with a configurable touch screen, probably about 6" x 3" which would suit one-handed operation."

I can’t believe I’m the first person to articulate such a use case. In fact I know I’m not, for two reasons. When I set up the first iteration of my home cinema system in about 2004, I read a lot of magazines and they said similar things.

And then I managed to buy a dedicated device which actually did this job remarkably well. It was called a Sunwave Universal Remote, and had a programmable LCD touchscreen. It had the ability to choose which device functions appeared where, and to record commands from existing remotes or define macros (sequences of commands). This provided some, limited, "mixed device" capability, although the primary approach was modal (select the target device, and then use controls for that device). A set of batteries lasted about a year.

There were only two problems. First, as successive TVs became smarter than their 2004 counterparts, it became an increasing challenge to find appropriate buttons for all the functions from within the fixed option list. Then, after 13 or so years of sterling service, the LCD started to die. I still own the control, but it’s now effectively unusable.

My first approach was to try and get a direct replacement. However it’s clear that these devices haven’t been manufactured for years. The few similar items on eBay are either later poor copies, with very limited functionality, or high-end solutions based on old PDAs at ridiculous prices.

But hang on. "a configurable touch screen, probably about 6" x 3"". Didn’t I see such a device quite recently? I think someone was using one to make a phone call, or surf the internet, or check Facebook, or play Angry Birds, or some such. In fact we all use smartphones for much of our technology interaction, so why not this use case?

Achtung! Rabbit hole! Dive! Dive! 🙂

Why not, indeed? Actually I knew it was theoretically possible, because my old Samsung 10" tablet which was about to go on eBay had some software called "Peel Remote" installed as standard, and I’d played with controlling hotel TVs with it. I rescued it from the eBay pile and had an experiment. The first discovery was that while there’s a lot of "universal remote" software on Google Play, most is rubbish, either with very limited functionality, or crippled by stupid amounts of highly-invasive advertising. There are a few honourable exceptions, and after a couple of false starts I settled on AnyMote developed by Color Tiger. This has good "lookup" support to get you started, a nice editing function within the app, and decent ways to backup and share remote definitions between devices. A bit of fiddling got me set up with a screen which controlled our system much better than before, and it got us through all our Christmas watching.

However picking up a 10" tablet and turning it on every time you want to pause a video is a bit clumsy, so back to the idea of using a phone…

And here’s the problem. Most phones have no infrared support. While I haven’t done any sort of scientific analysis, I’d guess that 70-80% (by model) just don’t have what’s known as an "infrared blaster", the element which actually emits the infrared signals. Given that this is very simple technology, not much more than an infrared LED in the phone’s top edge, it’s an odd omission. We build devices stuffed with every sort of wireless and radio interface, but omit this common one used by much of our other technology.

Fortunately it’s not universal, and there are some viable options. A bit of googling suggested that the LG G2 does have an IR blaster, and I tracked down one for about £50 on eBay. It turns up, the software installs…, and it just doesn’t work. That’s when I find the next problem: several of the phone manufacturers who make both TVs and phones (LG and Sony are the most obvious offenders) lock down their IR capabilities, so they are not accessible to third party software. You can use your LG phone to control your LG TV, but that’s it, and f*** all use to me.

Back on Google and eBay. The HTC One M7 and M8 do have IR and do seem to support third-party software. The M8 is a bit bigger, probably better for my use case, and there’s one on eBay in nice condition for a good price. It turns up, the software installs…, and then refuses to run properly. It can’t access the IR blaster. Back on Google and confirm the next problem. Most phones which have been upgraded from Android 5 or earlier to Android 6 have a changed software interface to the infrared which doesn’t work for a lot of third-party software. Thanks a billion, Google. 🙁

OK, last roll of the dice. The HTC One M7 still runs Android 5. I find a nice blue one, a bit more money than the M8 ironically, but still within budget. It turns up, the software installs…, and it works! I have to do a few minor adjustments on the settings copied from my tablet, but otherwise straightforward. I had to install some software to make the phone turn on automatically when it’s picked up, and I may still have to do a bit of fiddling to optimise battery life, but for now it’s looking good…

Third time lucky, but it really didn’t have to be that difficult. For reasons which are impossible to fathom, both Google and most phone manufacturers sit somewhere between ignoring and actively obstructing this valid and common use case. Ironically, given their usual insularity, things are a bit easier in the Apple world, with good support for third party IR blasters which plug into an iPhone’s headphone socket, but that wouldn’t be a good solution given the rest of my tech portfolio. For now I have a solution, but I’m not impressed.

Posted in Android, Thoughts on the World

How Strong Is Your Programming Language?

Line-up at the 2013 Europe's Strongest Man competition
Camera: Canon EOS 7D | Date: 29-06-2013 05:31 | Resolution: 5184 x 3456 | ISO: 200 | Exp. bias: -1/3 EV | Exp. Time: 1/160s | Aperture: 13.0 | Focal Length: 70.0mm (~113.4mm)

I write this with slight trepidation as I don’t want to provoke a "religious" discussion. I would appreciate comments focused on the engineering issues I have highlighted.

I’m in the middle of learning some new programming tools and languages, and my observations are coalescing around a metric which I haven’t seen assessed elsewhere. I’m going to call this "strength", as in "steel is strong", defined as the extent to which a programming language and its standard tooling avoid wasted effort and prevent errors. Essentially, "how hard is it to break?". This is not about the "power" or "reach" of a language, or its performance, although typically these correlate quite well with "strength". Neither does it include other considerations such as portability, tool cost or ease of deployment, which might be important in a specific choice. This is about the extent to which avoidable mistakes are actively avoided, thereby promoting developer productivity and low error rates.

I freely acknowledge that most languages have their place, and that it is perfectly possible to write good, solid code with a "weaker" language, as measured by this metric. It’s just harder than it has to be, especially if you are free to choose a stronger one.

I have identified the following factors which contribute to the strength of a language:

1. Explicit variable and type declaration

Together with case sensitivity issues, this is the primary cause of "silly" errors. If I start with a variable called FieldStrength, then accidentally refer to FeildStrength, and this can get through the editing and compile processes and throw a runtime error because I’m trying to use an undefined value, then the programming "language" doesn’t deserve the label. In a strong language, this will be immediately questioned at edit time, because each variable must be explicitly defined, with a meaningful and clear type. Named types are better than those assigned by, for example, using multiple different types of brackets in the declaration.
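
Python, for example, happily illustrates the failure mode (a two-line sketch, using the variable names from the example above):

```python
FieldStrength = 51.2
print(FeildStrength)  # accepted by the editor; NameError only at runtime
```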

2. Strong typing and early binding

Each variable’s type should be used by the editor to only allow code which invokes valid operations. To maximise the value of this the language and tooling should promote strong, "early bound" types in favour of weaker generic types: VehicleData not object or var. Generic objects and late binding have their place, in specific cases where code must handle incoming values whose type is not known until runtime, but the editor and language standards should then promote the practice of converting these to a strong type at the earliest practical opportunity.

Alongside this, the majority of type conversions should be explicit in code. Those which are always "safe" (e.g. from an integer to a floating point value, or from a strong type to a generic object) may be implicit, but all others should be spelt out in code with the ability to trap errors if they occur.
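
A sketch of that "explicit, with error trapping" style, again in Python purely for illustration:

```python
raw = "42"  # a value whose type is only known at runtime
try:
    count = int(raw)  # the conversion is explicit, spelt out in code
except ValueError:
    count = 0  # and a failed conversion can be trapped and handled
```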

3. Intelligent case insensitivity

As noted above, this is a primary cause of "silly" errors. The worst case is a language which allows unintentional case errors at edit time and through deployment, and then throws runtime errors when things don’t match. Such a language isn’t worth the name. Best case is a language where the developer can choose meaningful capitalisation for clarity when defining methods and data structures, and the tools automatically correct any minor case issues as the developer references them; if the items are accessed via a mechanism which cannot be corrected (e.g. via a text string passed from external sources), then matching is case-insensitive. In this best case the editor and compiler will reject any two definitions with overlapping scope which differ only in case, and require a stronger differentiation.

Somewhere between these extremes a language may be case sensitive but require explicit variable and method declaration and flag any mismatches at edit time. That’s weaker, as it becomes possible to have overlapping identifiers and accidentally invoke the wrong one, but it’s better than nothing.

4. Lack of "cruft", and elimination of "ambiguous cruft"

By "cruft", I mean all those language elements which are not strictly necessary for a human reader or an intelligent compiler/interpreter to unambiguously understand the code’s intent, but which the language’s syntax requires. They increase the programmer’s work, and each extra element introduces another opportunity for errors. Semicolons at the ends of statements, brackets everywhere and multiply repeated type names are good (or should that be bad?) examples. If I forget the semicolon but the statement fits on one line and otherwise makes syntactic sense then then code should work without it, or the tooling should insert it automatically.

However, the worse issue is what I have termed "ambiguous cruft", where it’s relatively easy to make an error in this stuff which takes time to track down and correct. My personal bête noire is the chain of multiple closing curly brackets at the end of a complex C-like code block or JSON file, where it’s very easy to mis-count and end up with the wrong nesting.  Contrast this with the explicit End XXX statements of VB.Net or name-matched closing tags of XML. Another example is where an identifier may or may not be followed by a pair of empty parentheses, but the two cases have different meanings: another error waiting to occur.

5. Automated dependency checking

Not a lot to say about this one. The compile/deploy stage should not allow through any code without all its dependencies being identified and appropriately handled. It just beggars belief that in 2017 we still have substantial volumes of work in environments which don’t guarantee this.

6. Edit and continue debugging

Single-stepping code is still one of the most powerful ways to check that it actually does what you intend, or to track down more complex errors. What is annoying is when this process indicates the error, but it requires a lengthy stop/edit/recompile/retest cycle to fix a minor problem, or when even a small exception causes the entire debug session to terminate. Best practice, although rare, is "edit and continue" support which allows code to be changed during a debug session. Worst case is where there’s no effective single-step debug support.


Some Assessments

Having defined the metric, here’s an attempt to assess some languages I know using it.

It will come as no surprise to those who know me that I give VB.Net a rating of Very Strong. It scores almost 100% on all the factors above, in particular being one of very few languages to express the outlined best practice approach to case sensitivity. Although fans of more "symbolic" languages derived from C may not like the way things are spelled out in words, the number of "tokens" required to achieve things is very low, with minimal "cruft". For example, creating a variable as a new instance of a specific type takes exactly 5 tokens in VB.Net, including explicit scope control if required and with the type name (often the longest token) used once. The same takes at least 6 tokens plus a semicolon in Java or C#, with the type name repeated at least once. As noted above, elements like code block ends are clear and specific, removing a common cause of silly errors.

Is VB.Net perfect? No. For example if I had a free hand I would be tempted to make the declaration of variables for collections or similar automatically create a new instance of the appropriate type rather than requiring explicit initialisation, as this is a common source of errors (albeit well flagged by the editor and easily fixed). It allows some implicit type conversions which can cause problems, albeit rarely. However it’s pretty "bomb proof". I acknowledge there may be some cause and effect interplay going on here: it’s my language of choice because I’m sensitive to these issues, but I’m sensitive to these issues because the language I know best does them well and I miss that when working in other contexts.

It’s worth noting that these strengths relate to the language and are not restricted to expensive tools from "Big bad Microsoft". For example the same statements can be made for the excellent VB-based B4X Suite from tiny Israeli software house Anywhere Software, which uses Java as a runtime, executes on almost any platform, and includes remarkable edit and continue features for software which is being developed on PC but running on a mobile device.

I would rate Java and C# slightly lower as Pretty Strong. As fully compiled, strongly typed languages many potential error sources are caught at compile time if not earlier. However, the case-sensitivity and the reliance on additional, arguably redundant "punctuation" are both common sources of errors, as noted above. Tool support is also maybe a notch down: for example while the VB.Net editor can automatically correct minor errors such as the case of an identifier or missing parentheses, the C# editor either can’t do this, or it’s turned off and well hidden. On a positive note, both languages enforce slightly more rigor on type conversions. Score 4.5 out of 6?

Strongly-typed interpreted languages such as Python get a Moderate rating. The big issue is that the combination of implicit variable declaration and case sensitivity allow through far too many "silly" errors which cause runtime failures. "Cruft" is minimal, but the reliance on punctuation variations to distinguish the declaration and use of different collection types can be tricky. The use of indentation levels to distinguish code blocks is clear and reasonably unambiguous, but can be vulnerable to editors invisibly changing whitespace (e.g. converting tabs to spaces). On a positive note the better editors make good use of the strong typing to help the developer navigate and use the class structure. I also like the strong separation of concerns in the Django/Jinja development model, which echoes that of ASP.Net or Java Server Faces. I haven’t yet found an environment which offers edit and continue debugging, or graceful handling of runtime exceptions, but my investigations continue. Score 2.5 out of 6?

Weakly-typed scripting languages such as JavaScript or PHP are Weak, and in my experience highly error prone, offering almost none of the protections of a strong language as outlined above. While I am fully aware that like King Canute, I am powerless to stop the incoming tide of these languages, I would like to hope that maybe a few of those who promote their use might read this article, and take a minute to consider the possible benefits of a stronger choice.


Final Thoughts

There’s a lot of fashion in development, but like massive platforms and enormous flares, not all fashions are sensible ones… We need a return to treating development as an engineering discipline, and part of that may be choosing languages and tools which actively help us to avoid mistakes. I hope this concept of a "strength" metric might help promote such thinking.

Posted in Agile & Architecture, Code & Development

Why I (Still) Do Programming

It’s an oddity that although I sell most of my time as a senior software architect, and can also afford to purchase software I need, I still spend a lot of time programming, writing code. Twenty-five years ago people a little older than I was then frequently told me “I stopped writing code a long time ago, you will probably be the same”, but it’s just turned out to be completely untrue. It’s not even that I only do it for a hobby or personal projects, I work some hands-on development into the majority of my professional engagements. Why?

At the risk of mis-quoting the Bible, the answer is legion, for they are many…

To get the functionality I want

I have always been a believer in getting computers to automate repetitive actions, something they are supremely good at. At the same time I have a very low patience threshold for undertaking repetitive tasks myself. If I can find an existing software solution great, but if not I will seriously consider writing one, or at the very least the “scaffolding” to integrate available tools into a smooth process. What often happens is I find a partial solution first, but as I get tired of working around its limitations I get to the point where I say “to hell with this, I’ll write my own”. This is more commonly a justification for personal projects, but there have been cases where I have filled gaps in client projects on this basis.

Related to this, if I need to quickly get a result in a complex calculation or piece of data processing, I’m happy to jump into a suitable macro language (or just VB) to get it, even for a single execution. Computers are faster than people, as long as it doesn’t take too long to set the process up.

To explore complex problems

While I am a great believer in the value of analysis and modelling, I acknowledge that words and diagrams have their limits in the case of the most complicated problem domains, and may be fundamentally difficult to formulate and communicate for complex and chaotic problem domains (using all these terms in their formal sense, and as they are used in the Cynefin framework, see here).

Even a low-functionality prototype may do more to elicit an understanding of a complex requirement than a lot of words and pictures: that’s one reason why agile methods have become so popular. The challenge is to strike a balance, and make sure that an analytical understanding does genuinely emerge, rather than just being buried in the code and my head. That’s why I am always keen to generate genuine models and documentation off the back of any such prototype.

The other case in which I may jump into code is if the dynamic behaviour of a system or process is difficult to model, and a simulation may be a valid way of exploring it. This may just be the implementation of a mathematical model, for example a Monte Carlo simulation, but I have also found myself building dynamic visual models of complex interactions.

To prove my ideas

Part of the value I bring to professional engagements is experience or knowledge of a range of architectural solutions, and the willingness to invoke unusual approaches if I think they are a good fit to a challenge. However it’s not unusual to find that other architects or developers are resistant to less traditional approaches, or those outside their comfort zones. Models and PowerPoint can go only so far in such situations, and a working proof of concept can be a very persuasive tool. Conversely, if I find that it isn’t as easy or as effective as I’d hoped, then “prove” takes on its older meaning of “test” and I may be the one being persuaded. I’m a scientist, so that’s fine too.

To prove or assess a technology

Related to the last, I have found by hard-won experience that vendors consistently overstate the capabilities of their solutions, and a quick proof of concept can be very powerful in confirming or refuting a proposed solution, establishing its limitations or narrowing down options.

A variant on this is where I need to measure myself, or others, for example to calibrate what might or might not be adequate productivity in a given situation.

To prove I can

While I am sceptical of overstated claims, I am equally suspicious if I think something should be achievable, and someone else says “that’s not possible”. Many projects both professional and personal have started from the assertion that “X is impossible”, and my disbelief in that. I get a great kick from bending technology to my will. To quote Deep Purple’s famously filthy song, Knocking At Your Back Door, itself an exploration into the limits of possibility (with censorship), “It’s not the kill, it’s the thrill of the chase.”

In the modern world of agile development processes, architect and analyst roles are becoming blurred with that of “developer”. I have always straddled that boundary, and proving my development abilities may help my credibility with development teams, allowing me to engage at a lower level of detail when necessary. My ability to program makes me a better architect, at the same time as architecture knowledge makes me a better programmer.

To make money?

Maybe. If a development activity can help to sell my skills, or advance a client’s project, then it’s just part of my professional service offering, and on the same commercial basis as the rest. That’s great, especially if I can charge a rate commensurate with the bundle of skills, not just coding. My output may be part of the overall product or solution or an enduring utility, but more often any development I do is merely the means to an end which is a design, proof of concept, or measurement.

On the other hand, quite a lot of what I do makes little or no money. The stuff I build for my own purposes costs me little, but has a substantial opportunity cost if I could use the time another way, and I will usually buy a commercial solution if one exists. The total income from all my app and plugin development over the years has been a few hundred pounds, probably less than I’ve paid out for related tools and components. This is a “hobby with benefits”, not an income stream.

Because I enjoy it

This is perhaps the nub of the case: programming is something I enjoy doing. It’s a creative act, and puts my mind into a state I enjoy, solving problems, mastering technologies and creating an artefact of value from (usually) a blank sheet. It’s good mental exercise, and like any skill, if you want to retain it you have to keep in practice. The challenge is to do it in the right cases and at the right times, and remember that sometimes I really should be doing something else!

Posted in Agile & Architecture, Code & Development, Thoughts on the World

Dozy Android

I’ve just spent a good couple of hours sorting out a problem with my new phone, which has no good reason to exist. In fairness to Sony, it’s nothing to do with them: the issue sits squarely with Google and yet another "improvement" to Android which turns out to be nothing of the sort.

A watch-based alarm doesn’t work very well for me – my hearing is just not good enough. Seeking to reduce the number of gadgets I carry, I have therefore for many years relied on phones and their PDA predecessors to fulfil the function of alarm clock, especially when I’m travelling. It’s not a difficult role, and I have not had cause to complain about it. Until now.

In my normal weekly cycle I don’t have much need for a clock, as I wake naturally at about the right time each day. This makes the operation of such a function even more critical, as it has to be absolutely reliable on days which are exceptions, and I don’t get much opportunity for advance "testing" of what I assume is something that should "just work". However, I do have the alarm set every day when I’m working away from home, and although I couldn’t be absolutely sure, I was coming to suspect that it wasn’t going off at the right time. The first couple of times I assumed "user error": incorrect settings, volume too low etc., but I eventually eliminated those, and confirmed the behaviour: the alarm didn’t go off at the programmed time. It went off after I had woken up and clicked the button to wake up my phone’s screen.

This is about as useful as a chocolate fireguard, and about as welcome as a fart in a spacesuit.

A bit of Googling confirmed that the problem is quite widespread. I’ve read stories of people with new phones being late for work or missing important appointments. Others describe a similar problem with other programs including not getting notified promptly of night-time messages or similar: potentially quite a problem for those "on call". Fortunately I caught the problem before it caused me any trouble, but that might not have been the case, as I have an upcoming trip with about 8 flights and several other dawn starts.

The web is full of useless "solutions" like factory resetting the phone, but after eliminating those, I tracked down the cause of the problem. With Android 6 ("Marshmallow"), Google introduced something called "Doze" mode. This is a deep sleep mode which kicks in if the device is at rest, screen off, and no significant ongoing activity like an active data transfer. You know, like it tends to be at night. In this state, the system not only slows down processing, but also suspends the bulk of normal background activity. This includes, for no articulated good reason, suspending timers and related event triggers. So your alarm application doesn’t know what time it is, and doesn’t fire. Your messaging app doesn’t know when to poll for incoming events. Simple, core functions of your smartphone just cease to work.

Allegedly, if you change the code of your alarm or other app to use a "different kind" of timer, that should work, but after testing four or five I concluded that this is just not true, certainly on my phone. In any case, I usually just use the stock Android "clock" app, and surely they would have remembered to update that, wouldn’t they? You can also nominally turn off Doze for selected applications, but as far as I can see it makes bugger all difference.

It turns out that the root problem is that in at least some Android 6 implementations, Doze mode actually disables the underlying operating system events on which the other timers are based. It doesn’t matter how sexy your alarm app is, or whether Doze knows about it or not, if the underlying timers are blocked!

There’s a heap of advice on the web about how to disable Doze for individual apps (tried that, doesn’t work), but not about how to disable it completely. I’d tried all sorts of settings without success. However I finally found a useful little app called Disable Doze, which does what it says on the tin, and turns Doze off completely. Allegedly (according to Google) this would result in my phone discharging its battery at a terrifying rate and ending up doing a Galaxy Note 7 impersonation, but I can confirm that with Doze off, in light use my phone still only consumes about 10% of its battery per day. The only noticeable effect so far is that alarms and notifications work again.
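
For the terminally curious: Doze can also be switched off by hand from a PC with USB debugging enabled, which I assume (though haven’t verified) is the same switch that Disable Doze flips:

    adb shell dumpsys deviceidle disable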

My worry is that until Google acknowledge their mistakes, they may come up with another "improvement" which disables this fix. I don’t know what tests Google perform in this area, but they are clearly inadequate. This really is a "0 out of 10" effort, a true "breaking change".

However for now things are looking good, and hopefully this blog will help alert others to the problem and the fix.

Posted in Android, Thoughts on the World | Leave a comment

Conversion Challenges

I have an interesting challenge: one of the projects I am working on wants to shut down its environments to save costs, but I need ongoing access to the data. I have a dump from an Oracle database, but I need to convert it to SQL Server, which is much more portable. The solution looks like an excellent little product from "Intelligent Converters", who have a whole suite of these tools. I’ll try it and let you know how I get on.

Intelligent Converters - software to convert MS Access, DBF and Oracle databases to MySQL and vice versa, PDF to Word, PDF to HTML, PDF to text

Posted in Code & Development | Leave a comment

Platform Flexibility – It’s Alive!

The last post, written largely back in November and published just before Christmas, suggested that camera manufacturers should focus on opening up their products as development platforms, much as has happened with mobile phones. While I can’t yet report on this happening for cameras, I now have direct experience of exactly this approach in another consumer electronics area.

I decided to replace a large picture frame in my office with an electronic display, on which I could see a rolling presentation of my own images. This is not a new idea, but decreasing prices and improving specs brought into my budget the option of a 40"+ 4K TV, which, going by the experience of our main TV, should be an excellent solution.

New Year’s Eve brought a trip to Richer Sounds in Guildford. As usual the staff were very helpful and we quickly narrowed down the options to equivalent models from Panasonic or Sony. The Panasonic option was essentially just a smaller version of our main TV, but its colours were slightly "off" and we preferred the picture quality of the Sony. The Panasonic’s slideshow application is OK but limited, while the Sony’s built-in app looked downright crude. It looked like a difficult choice, but then I realised that the Sony operating system is something called "AndroidTV" with Google Play support, which promised the option of a more open platform, maybe even some development of my own. Sold!

In practice, it’s exactly as I expected. The basic hardware is good, but the Sony’s default applications beyond the core TV are a bit crude. However some browsing on Google Play revealed a couple of options, and I eventually settled on Kodi, a good open-source media player, which does about 90% of what I want for the slideshow. Getting it running was a bit fiddly, not least because a key picture-handling setting has to be set by uploading a small XML file rather than via the app’s UI, but after only a little juggling it’s now running well and doing most of what I want.
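
For anyone attempting the same thing: the XML file in question is Kodi’s advancedsettings.xml, dropped into the userdata folder. I won’t swear to the exact keys here – treat this as an illustrative sketch of the slideshow block, with element names as per the Kodi wiki – but it gives the flavour:

    <advancedsettings>
      <slideshow>
        <panamount>0</panamount>    <!-- turn off the Ken Burns-style pan -->
        <zoomamount>0</zoomamount>  <!-- and the zoom effect -->
      </slideshow>
    </advancedsettings>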

Beyond that, I can either develop an add-on for Kodi, or a native application for AndroidTV. However, as the existing developer community has provided a 90% solution, I’m not in a great hurry.

I call that a result for platform vs product…

Posted in Agile & Architecture, Android, Code & Development, Photography, Thoughts on the World | Leave a comment

Do We Want Product Development, or Platform Flexibility?

There’s been a bit of noise recently in the photography blogosphere relating to how easy it is to make changes to camera software, and why, as a result, it feels like camera manufacturers are flat out not interested in the feature ideas of their professional and more capable enthusiast users. It probably started with this article by Ming Thein, and this rebuttal by Kirk Tuck, followed by this one and this one by Andrew Molitor.

The problem is that my "colleagues" (I’m not quite sure what the correct collective term is here) are wrong, for different reasons. They are all thinking of the camera as a unitary product, and none of them (even Molitor, who claims to have some experience as a system architect) is thinking, as they should, of the camera as a platform.

OK, one at a time, please…

There are a lot of good ideas in Ming Thein’s article. A lot of his suggestions to improve current mirrorless cameras are good ones with which I agree. The trouble is that he is trying to design "Ming Thein’s perfect camera", and I suspect that it wouldn’t be mine. For a start it would end up far too heavy, too expensive and with too many knobs!

Kirk Tuck gets this, and his article is a sensible exploration of trade-offs and how one photographer’s ideal may be another’s nightmare. However he paints a picture of flat-lining development which is very concerning, because there are some significant deficiencies in current mainstream cameras which it would be great to address.

Andrew Molitor then picks up this strand, and tries to explain why all camera feature development is difficult and prohibitively expensive, and why Expose to the Right (ETTR) is especially difficult. Setting aside that referring to Michael Reichmann as "a pundit" is unkind and a considerable underestimation of that eminent photographer’s capabilities, there are several fallacies in Molitor’s articles. Firstly, it just would not be as difficult as claimed to implement ETTR metering, or any variant of it. It’s just another metering calculation. If you have a camera with some form of live histogram or overexposure warning, then you can already operate this semi-manually, tweaking down the exposure compensation until the level of clipping is what you want. If you can do it via a predictable process, then that enormously powerful computer you call a digital camera can easily be made to replicate the same process quickly and efficiently. That’s what the metering system does. It’s even quite likely that the engineers have already done something similar, but hidden it. (Hint: if you have a scene mode called something like "candle-lit interior", you’re almost there…)
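
To make the point concrete, here is roughly what that predictable process looks like as code. Everything below is hypothetical pseudo-firmware – the interfaces are invented purely to show the shape of the calculation, not to represent any manufacturer’s actual API:

    // Invented interfaces, purely illustrative
    interface Exposure  { Exposure compensate(double stops); }
    interface Histogram { double fractionClippedHighlights(); }
    interface LiveView  { Histogram previewAt(Exposure e); }

    class EttrMeter {
        // Start from the standard matrix metering result, then walk the
        // exposure down in 1/3-stop steps until highlight clipping falls
        // within tolerance - exactly what I do by hand with the EV dial.
        static Exposure meter(Exposure matrixResult, LiveView liveView, double maxClipped) {
            Exposure e = matrixResult;
            while (liveView.previewAt(e).fractionClippedHighlights() > maxClipped) {
                e = e.compensate(-1.0 / 3.0);
            }
            return e;
        }
    }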

I suspect the calculations of grossed-up cost are also fallacious. If they were correct, in a market which manages US sales of only a few tens of thousands of mirrorless cameras per year (for example), we would never get any new features at all. The twin realities are that by combining multiple features into the normal streams of product or major release development, many of the extra costs are amortised, and that the big Japanese electronics companies are known to apply different accounting standards to development of their flagship products. If Molitor’s argument were correct, we would not see features in each new camera such as a scene mode for "baby’s bottom on pink rug" (OK, I made that one up :)) or in-camera HDR, and things like that don’t seem to be a problem. I simply cannot believe that "baby’s bottom on pink rug" will generate millions of extra dollars of revenue, compared with a "control highlight clipping" advanced metering mode, which would be widely celebrated by almost all equipment reviewers and advanced users.

So assuming that I’m right, and on-going feature development is both feasible and desirable, where does that leave us?

Ming Thein is not alone in expressing disappointment with the slow provision of improved features aimed at the advanced photographer, and I agree with him that the lack of progress is really very annoying. In my most recent review, I identified several relatively simple features which would be of significant value to the advanced photographer, and which could easily be implemented in the software of any good mirrorless camera without hardware changes, including:

  1. Expose to the right or other "automatically control highlight clipping" metering
  2. Optimisation for RAW Capture (e.g. histogram from RAW, not JPG)
  3. Proper RAW-based support for HDR, panoramas, focus stacking and other multishot techniques
  4. Focal distance read-out and hyperfocal focus
  5. Note taking and other content enrichment

All of these have been identified requirements/opportunities since the early era of digital photography. Many of them are successfully implemented in a few, perhaps more unusual models. For example the Phase One cameras implement a lot of the focus-related features, the Olympus OM-D E-M5 II does a form of image stacking for resolution enhancement, and Panasonic have just introduced a very clever implementation of focus bracketing in the GX8, based on a short 4K burst. However, by and large, the mainstream manufacturers have not made any significant progress towards them. Even if Molitor’s analysis is correct, and this is all much more difficult than I expect (despite my strong software development experience), you would think that over time there would be at least some, perhaps limited, visible progress, but no. If the concepts were really "on the product backlog" (to use the iterative development term), then some would by now have "made the cut", but instead we get yet more features for registering babies’ faces…

My guess is that some combination of the following is going on:

  • The "advanced photographer" market is relatively small, and quite saturated. Camera manufacturers are therefore trying to make their mid-range products attractive to users who would previously have bought a cheaper device, and who may well consider just using a phone as an option. To do this, the device needs to offer lots of "ease of use" features.
  • Marketing and product management groups are focused on the output of "focus groups", which inevitably generate lowest-common-denominator requirements which look a lot like current capabilities.
  • Manufacturers are fixated on a particular set of use cases and can’t conceive that anyone would use their products in a different way.

The trouble is that this leaves the more experienced photographers very frustrated. The answer is flexibility. By all means offer an in-camera, JPG-only HDR for the novice user, but don’t fob me off with it – offer me flexible RAW-based multishot support as well. Re-assignable buttons are a good step in the right direction, but they are not where flexibility begins and ends. The challenge, of course, is to find a way to provide this within fixed product cycles and limited budgets.

I think the answer lies with software architecture, and in particular with how we view the digital camera. It’s time for us all, manufacturers and advanced users alike, to stop thinking of the camera as a "product", and to start thinking of it as a "platform" for more open development. In this model the manufacturer still sells the hardware, complete with basic functionality. Others extend the platform with "add-ins" or "apps", which provide new ways to drive and exploit the hardware’s capabilities.

We’ve been here before. In the early noughties, mobile phone hardware had evolved beyond all recognition (my first mobile phone was a Vodafone prototype which filled one seat and the boot of my Golf GTI, and needed a six-foot whip antenna!). However, you bought your phone from Nokia, for example, and it did what it did. If you didn’t like the contact management functionality, you were stuck with it.

Then Microsoft, followed more visibly by Apple and eventually Google, broke this model by delivering a platform: a device which made phone calls, sure, but which also supported a development ecosystem, so that some people could develop "apps", and others could install and use those which met their needs. Contact management functionality is now limited only by the imagination of the developer community. Despite my criticism of some early attempts, the approach is now pretty much universal, and I don’t think I could go back to a world where my phone was a locked-down, single-purpose device.

The digital camera needs to go the same way, and quickly, before it is overrun by the phone coming at the same challenge from the other side. Camera manufacturers need to stop thinking about "what other features should we develop for the next camera", and instead direct themselves to two questions, one familiar and one not. The familiar one is, of course, "how can we make the hardware even better?" The unfamiliar one is "how can we open up this platform so that developers can exploit it, and deliver all that stuff the advanced users keep going on about?"

Ironically, for many manufacturers many of the concepts are already in place, just not joined up. The big manufacturers all offer open lens mounts, so that anyone can develop lenses for their bodies. In the case of Panasonic, Olympus and the other Micro Four Thirds partners it’s even an open multi-party standard. Panasonic certainly now deliver "platform" televisions with the concept of third-party apps. There’s a healthy community of "hackers" developing modified firmware for Canon and Panasonic cameras, albeit at arm’s length from, and with a slightly ambivalent relationship to, the manufacturers. I’m sure many of those would much prefer to be working as partners, within an open development model.

So what should such a "platform for extensibility" look like? Assuming we have a high-end mirrorless camera (something broadly equivalent to a Panasonic GX8) to work with as base platform, here are some ideas:

  1. A software development kit, API and "app store" or similar for the development and delivery of in-camera "apps". For example, it should be possible to develop an ETTR metering module, which the user can choose as an optional metering mode (instead of standard matrix metering). This would be activated in place of the standard metering routine, take in the current exposure, and return the required exposure settings and perhaps some correction metadata (a sketch of what such a contract might look like follows this list). Obviously the manufacturer would have to make sure that any such module returned "safe" values, but in a mirrorless camera it should be very easy to check that the exposure settings are "reasonable" and revert to a default if not. Other add-ins could tap into events such as the completion of an exposure, or could activate functions such as setting focal distance. The API should either be development-language-agnostic, or should support a well-known language such as Java, C++ or VB. That would also make it easier to develop an IDE (exploiting Visual Studio or Eclipse as a base), emulators and the like. There’s no reason why the camera needs an "open" operating system.
  2. An SDK for phone apps. This might be an even easier starting point, albeit with limitations. Currently manufacturers such as Panasonic provide some extended functions (e.g. geotagging) via a companion app for the user’s phone, but these apps are "closed", and if they don’t do what you want, that’s the end of it. It would be very easy for these manufacturers to open up this API, by providing libraries which other developers can access. My note-taking concept could easily be delivered this way. The beauty of this approach is that it has few or no security issues for the camera, and the application management infrastructure is delivered by Google, Apple and Microsoft.
  3. An open way to share, extend and move metadata. Panasonic support some content enrichment, but in an absolutely nonsensical way, as those features only work for JPEG files. What Panasonic appear to be doing is writing to the JPEG EXIF data, but not even copying to the RAW files. The right solution is support for XMP companion files. These can then accompany the RAW file through the development process, being progressively enhanced by different tools, and relevant data will be permanently written to the output JPEG. This doesn’t have to be restricted to static, human-readable information. If, for example, the ETTR metering module can record the difference between its exposure and the one set by the default matrix method, then this can be used by the RAW processing to automatically "normalise" back to standard exposure during processing. XMP files have the great advantages that they are already an open standard, designed to be extensible and shared between multiple applications, and it’s pretty trivial to write code to manipulate them, so this route would be much better than opening up the proprietary EXIF metadata structures.
  4. A controllable camera. What I mean by this is that the features of the camera which might be within the scope of the new "apps" must be set via buttons, menus and "continuous" controls (e.g. wheels with no specific set positions), so that they can be overridden or adjusted by software. They must not be set by fixed manual switches, which may or may not be set where the software requires. The Nikon Df or the Fuji X-T1 may suit the working style of some photographers – that’s fine – but they are unsuited to the more flexible software environment I’m envisaging. While I prefer the ergonomics of "soft" controls anyway, in this instance they are also the solution which promotes the flexibility we’re seeking to achieve.
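
To illustrate point 1, the contract for such a metering module could be as small as the following sketch. Every name here is invented – no camera manufacturer ships anything like this yet, which is rather the point:

    // Hypothetical plug-in contract for an in-camera metering "app"
    public interface MeteringModule {
        // Given the scene as currently analysed and the exposure proposed by
        // the standard matrix metering, return the exposure the module wants.
        // The firmware sanity-checks the result and reverts to the default
        // metering if the returned values are out of safe bounds.
        ExposureSettings meter(SceneState scene, ExposureSettings matrixProposal);

        // Optional correction metadata (e.g. the offset from the standard
        // exposure) to be written to the XMP sidecar from point 3, so that
        // RAW processing can automatically normalise the result.
        String correctionMetadata();
    }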

This doesn’t have to be done in one fell swoop, and it might not be achievable (or even appropriate) 100% for every camera. That’s fine. Panasonic, for example, could make a great start by opening up the "Image App" library, which wouldn’t require any immediate changes to the cameras at all.

So how about it?

Posted in Agile & Architecture, Code & Development, Photography, Thoughts on the World | Leave a comment

SharePoint: Simply C%@p, or Really Complicated C%@p?

There’s a common requirement for professional users of online document management systems. Sometimes you want to have access to a subset of files offline, with the ability to upload changes when you have finished work and are connected again. Genuine professional document management solutions like Open Text LiveLink have been able to do this for years, frequently with a little desktop add-in which presents part of the document library as a pseudo-drive in Windows Explorer.

Microsoft SharePoint can’t do this. It has never been able to do this, and it still can’t. Microsoft have worked out that it’s a requirement; they just seem completely incapable of implementing a usable solution to achieve it, despite the fact that doing so would instantly bridge a significant gap between their online DM solution and their desktop products.

For the first 10 years, they had no solution at all. Then Office 2010 introduced "Microsoft SharePoint Workspace 2010". This promises, but under-delivers. It can cache all the documents in a site into a hidden folder on your PC, and allows access to them through an application which looks a little bit like Windows Explorer, but isn’t. It’s very fiddly, and breaks all the rules about how you expect Office apps to work. It’s also slow and unreliable. Google it, and you find bloggers who usually praise Microsoft products to the skies using words like "execrable". Despite at least three Office releases since 2010, Microsoft don’t appear to have made any attempt to fix it.

There’s now an alternative option, in the form of OneDrive for Business. This has a different balance of behaviours. On the upside, you can control where it syncs files, so that they do appear in Explorer in a controlled fashion. On the downside, you can only link to a single SharePoint site (not much use if you have a client with multiple sites for different groups), and it still insists on synching all files in bulk, which is not what you want at all. On top of that I couldn’t get it to authenticate reliably, and was seeing a lot of failed synchronisations leaving my copy in an indeterminate state. There’s supposed to be a major rewrite in progress, bringing it more in line with the personal version of OneDrive, which works quite well, but there’s no sign of anything useful yet…

Having wasted enough time on a Microsoft-only solution, I reverted to an approach which does work fairly well, using the excellent Syncback Pro. You have to log in using Internet Explorer and the "keep me signed in" setting before it will work, but after that it delivers exactly what I want, allowing the selection of an exact subset of files, and the location of the copy on your PC, with intelligent two-way synchronisation. Perfect.

Perfect? Well, sort of. Syncback works very well, but even it can’t work around some fundamental limitations of SharePoint. The biggest problem is that when SharePoint ingests a file, it resets both the file modified date and the file created date to the date and time of ingestion! When you export or check the file, it therefore appears to be a changed, later version than the one you uploaded. Proper professional DM systems just don’t do this, and the Syncback guys haven’t found a workaround. Worse, I discovered that the SharePoint upload process was marking some files as checked in, and therefore visible to other users, and some as still checked out to me, and therefore invisible to others.

The latter is a real problem, since the whole point of uploading the files is to share them with others. It’s also very fiddly to fix, as SharePoint doesn’t seem to provide any list of files checked out, and there’s no mechanism for bulk check-in – you have to click on each file individually and go through the manual check-in process.

Aha, I thought. Surely Microsoft’s excellent development tools will allow me to quickly knock up a little utility to search through a site, find the files checked out to me, and programmatically check them in. Unfortunately not. The first red flag was the fact that on a PC with full installations of Office and a couple of versions of Visual Studio, there’s no installed object model for SharePoint. After a lot of Googling I found a download called the "Office Developer Tools for VS 2013". I didn’t think I needed this, given what I already had installed, but ran the installer anyway. This took longer to complete than a full installation of Office or Visual Studio would, and in the process silently closed all my open Office apps, losing some work. When it finished I still couldn’t see the SharePoint objects immediately, but adding a couple of references to my project manually finally worked – right up to the point where I tried to run the project, when execution failed on the first line. It appears that these objects only support development against a server running SharePoint itself: the code must execute on the server, and there’s no concept of developing a desktop tool which remotely interrogates a library.

OK, I thought. What about web services? I remember in the early days of SharePoint I was able to use SOAP web services to access and interrogate it, and I thought the same should still be true. To cut a long story short, that’s wrong. There’s no simple listing of the API, and attempting to interrogate the services using Visual Studio’s usually excellent tools failed at the first hurdle, with unresolvable authentication errors. In addition, Microsoft seem to have moved to a REST API, which is fundamentally much more difficult to drive if you don’t have a clear API listing. A lot of developers seem to be complaining about similar issues. I did find a couple of articles with sample code, but it all seems very complicated compared with what I remember of the original SOAP API.
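
For what it’s worth, as far as I can tell from the documentation I did find, the operation I wanted boils down to one REST call per file, along the lines of the sketch below. The endpoint is the documented SharePoint 2013 one; what’s conspicuously absent is the authentication handshake needed to obtain the cookie and request digest, which is precisely the part that defeated me:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class CheckInFile {
        // Check a single file back in via the SharePoint 2013 REST API.
        // Obtaining authCookie and requestDigest is left as a (decidedly
        // non-trivial) exercise - that handshake is where my attempts failed.
        public static void checkIn(String siteUrl, String serverRelativePath,
                                   String authCookie, String requestDigest) throws Exception {
            String endpoint = siteUrl
                + "/_api/web/GetFileByServerRelativeUrl('" + serverRelativePath + "')"
                + "/CheckIn(comment='bulk check-in',checkintype=0)";
            HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Accept", "application/json;odata=verbose");
            conn.setRequestProperty("Cookie", authCookie);
            conn.setRequestProperty("X-RequestDigest", requestDigest);
            conn.setDoOutput(true);
            conn.getOutputStream().close();   // empty POST body
            if (conn.getResponseCode() != 200) {
                throw new RuntimeException("Check-in failed: " + conn.getResponseCode());
            }
        }
    }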

After wasting a couple of hours on "quickly knocking up a little utility" I gave up, at least for now. Back to the manual check-in method…

I’ve never been a fan of SharePoint, but it appears to be getting worse, not better. At least the first versions were simply cr@p. The new versions are very complicated cr@p.

Posted in Agile & Architecture, Code & Development, Thoughts on the World | Leave a comment