About 4 years ago, I was a (very) junior embedded software developer at a local company. I had been assigned my first solo project: a small interface board that would store data in memory and then transmit it through an IrDA interface when a button was pressed. The logic was a short state machine in which the first button press stored the data and the second button press transmitted it and waited for an acknowledgment that it had been received.
So I worked with a hardware engineer to develop the small board, selected the MCU and other components, and started designing and coding. Some time later I had a working prototype and was very happy with it and proud of it. I was explaining to another engineer how my code was supposed to work when a field technician walked by and started listening to us. He said that he would be the one delivering the first prototypes to the test sites, so it would be a good idea if he got to test one before going there. You should know that technicians are not regarded at the same level of divinity as development engineers (although this guy, a good friend by the way, had more than 7 years of experience in the field), so I smugly said: “ha! my code is impervious to anything”, and handed the prototype over to him. He pressed the button, and the familiar beep that indicated that data had been read sounded. He pressed the button again, and the beeping continued, as expected, since there was no device in sight to acknowledge reception. I smiled proudly. Then he clicked the button a few times in quick succession, and it stopped beeping. He laughed, handed it back to me and went on his way. Upon further exploration I found I had missed a race condition in the keyboard interrupt, and the code was susceptible to “double clicks”.
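The original code is long gone, so what follows is only a minimal sketch in C of how a two-press flow like this can be guarded against rapid clicks. Everything in it is an assumption made for illustration: the state names, the `device_t` struct, the `device_on_button` and `device_on_ack` functions, and the 50 ms lockout window. The current time is passed in as a parameter so the logic can be exercised on a desktop machine; on real hardware the interrupt handler would typically only record the event and leave this logic to the main loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states for the two-press flow described above. */
typedef enum { ST_IDLE, ST_STORED, ST_WAIT_ACK } state_t;

/* Assumed lockout window; the real value depends on the button hardware. */
#define DEBOUNCE_MS 50u

typedef struct {
    state_t  state;
    uint32_t last_press_ms; /* time of the last press that passed the debounce check */
    bool     has_press;     /* false until the first press has been seen */
    bool     beeping;       /* assumed: beeps from the first press until the ack */
} device_t;

void device_init(device_t *d)
{
    d->state = ST_IDLE;
    d->last_press_ms = 0;
    d->has_press = false;
    d->beeping = false;
}

/* Handle one button event. Returns true if the press advanced the state. */
bool device_on_button(device_t *d, uint32_t now_ms)
{
    /* Reject presses that arrive too soon after the previous one.
       Unsigned subtraction keeps this correct across timer wraparound. */
    if (d->has_press && (uint32_t)(now_ms - d->last_press_ms) < DEBOUNCE_MS) {
        return false;
    }
    d->has_press = true;
    d->last_press_ms = now_ms;

    switch (d->state) {
    case ST_IDLE:      /* first press: store the data and start beeping */
        d->state = ST_STORED;
        d->beeping = true;
        return true;
    case ST_STORED:    /* second press: start transmitting, wait for ack */
        d->state = ST_WAIT_ACK;
        return true;
    case ST_WAIT_ACK:  /* further presses must not cancel the pending transfer */
        return false;
    }
    return false;
}

/* Only an acknowledgment from the receiver ends the cycle. */
void device_on_ack(device_t *d)
{
    if (d->state == ST_WAIT_ACK) {
        d->state = ST_IDLE;
        d->beeping = false;
    }
}
```

The point of the sketch is that no sequence of button presses, however fast, can leave the waiting state; only the acknowledgment can. Whether that matches the fix I actually made back then, I honestly cannot say.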
So a couple of weeks later I was a humbler engineer with 5 prototypes ready for field testing, and I planned to go and see the testing for myself (as I was humbler and now wanted to learn from the experienced technician). I cannot give too much detail on the application, but I can say that it was meant to work outdoors. So we turned on the little board and started the testing. Button press, read data, beeping, press again while pointing at the IrDA port… the beeping didn’t stop, but the receiving device said that it had already received the data. Second test: no reception of data, and the beeping didn’t stop. Several tries later, every attempt had ended in one of those two ways, never in success. Then we noticed something: if you put my device very close to the IrDA port, everything worked. So I started thinking that maybe the IrDA hardware didn’t have enough power, and went to my laptop to look at data sheets and find out everything I could. Meanwhile my technician friend just kept thinking, and after a few minutes he said: “it’s the sun”. At first I thought he was on drugs or something, but then it hit me: the sun was interfering with the infrared communications (remember I mentioned this application was for the outdoors?). It turned out he had run into a similar problem some time before on another, similar product. I had completely disregarded the place where the device would exist.
Some time later I had the chance to be involved in defining the development processes for that company, and I made sure the new design considerations included the external environment. Never think your application will live in a vacuum.