A Day in the Life: The Code

A while back, my team was asked to take over a domain from a business unit that had built one, then realized running it was more complex than they'd expected. I was asked to do an urgent reorganization, ripping out directory access and enforcing the principle of least privilege. We barely made the deadline, and even though I did my best to explain that they would no longer have carte blanche in the domain, they were surprised and annoyed to find they couldn't create service accounts or do other customer onboarding tasks.

Just like I've done dozens of times before, I worked with them to define their undocumented processes, focusing on the ones for which my team would be responsible going forward. During this process, I noticed they were using a very simplistic naming scheme for their customer resources, including a 3-character alpha identifier for service accounts and groups. After asking, I was told this code was based on a single physical location of the customer (Springfield would be SPR, things like that).

This was less than ideal for two main reasons:
  1. Overlap. Even though there are technically over 17,000 combinations of three alpha characters (26 x 26 x 26 = 17,576), anyone who's worked with large-scale systems can tell you that number is very misleading. Not only can you all but throw out rarely-used characters like X and Z, but certain names are simply popular, whether you're talking about cities or people (there are dozens of Springfields in the US, and nearly half a million people were named Michael in the 1990s). They hadn't hit a snag yet, but they were a small product, and continuing with that scheme would all but guarantee they'd encounter duplicate names at some point.
     
  2. It was arbitrary. They didn't use any government-endorsed abbreviation list or another authority when generating the codes. The person who did the install simply chose one based on whatever he or she thought it should be. So the code was meaningless outside of the context of that specific system in that specific environment. Granted, it would allow them to avoid the overlap issue for a time (the next Springfield could be SGF), but that would likely lead to confusion between customers in similarly-named cities.
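
To put a rough number on the overlap risk, here is a minimal sketch of the classic birthday-problem calculation applied to 3-letter codes. It assumes every code is picked uniformly at random from all 17,576 possibilities, which is the best case; since real codes cluster around popular city names and avoid letters like X and Z, the actual odds would be worse. The function name and the customer counts are purely illustrative.

```python
from math import prod

# Every possible 3-letter alpha code: 26 * 26 * 26 = 17,576.
CODE_SPACE = 26 ** 3


def collision_probability(customers: int, space: int = CODE_SPACE) -> float:
    """Chance that at least two customers end up with the same code.

    Assumes codes are chosen uniformly at random, which flatters the
    scheme: codes derived from city names are far from uniform.
    """
    if customers > space:
        return 1.0  # more customers than codes, so a duplicate is certain
    # Probability that every customer so far received a unique code.
    all_unique = prod((space - i) / space for i in range(customers))
    return 1.0 - all_unique


if __name__ == "__main__":
    for n in (50, 100, 157, 300):
        print(f"{n:>4} customers: {collision_probability(n):.1%} chance of a duplicate")
```

Even under that generous assumption, the odds of a duplicate reach roughly a coin flip somewhere around 150 to 160 customers, nowhere near the 17,576 the raw math seems to promise.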

The good part is we already had a prospective replacement for the current code. Whenever a new customer is added to our company-wide support portal, they're assigned a 4-digit number as their customer ID. That ID is referenced in many systems, and with all the acquisitions and mergers that have happened in the last ten years, it has proven very useful to key off that instead of anything that will become stale with an ownership change, such as the organization name or the geographic location of their headquarters.

I went over this with the business unit, and they agreed the current naming/identification scheme was sorely lacking. In a series of emails and meetings, we discussed their usages, dependencies, and what information was most important to include in what fields. During this process, I was informed that various other teams use the customer identifier to tag resources, associating them with that customer (storage, virtualization, etc.). I replied I didn't see any reason why those teams couldn't simply start using the new identifier on new resources. The business unit agreed.

We eventually came to a consensus on most things, but the customer code question was still not decided for good. I agreed to write up a process document that included both options, then went to yet another meeting to give an overview of all our decisions. I told them it was up to them, but I obviously thought the customer ID was the best option going forward. I said it would be ideal to have the decision made by the time we onboarded the next customer so we'd all be on the same page and have a clear way forward. That was three weeks ago.

Fast forward to yesterday. The business unit put in a couple of cases requesting we create resources for new customers, marking them as an elevated priority and reaching out to me individually to check on the progress. However, I'd never received a decision on the customer code issue. I pointed this out, saying I couldn't create the resources since I didn't know what customer identifier they wanted to use. They said they'd forgotten the decision was due (!) and would get me an answer by end of day. At 4:43pm their time, they replied with their approved naming scheme, and it included the customer ID. Hooray!

But hold on. A member of one of the dependent teams was copied on the email, and he replied all that he was concerned about the adoption of a new customer identifier. His worry basically boiled down to the risk of only partial adoption of the new standard and the fact that these other teams weren't consulted about this change. I replied that since both my team and the business unit would only be using the customer ID number going forward, all resources associated with new customers would be forced to align with the new standard since 3-alpha codes simply wouldn't exist for them. As for consultation, I told him my meetings had indicated the dependent teams simply needed a customer identifier, and outside of any technical limitations, I didn't see how using this customer ID number was any different from using the 3-alpha code. I have yet to be given any information that contradicts this assumption.

The business unit continued pushing hard to get the resources created, so this morning, I relented and used the scheme they'd sent out late yesterday, even though the member of the dependent team was still hitting me up on IM saying he didn't think it was a good idea and it wasn't right that I was dictating this for all the teams. I pointed out that I had presented what I believed was a better scheme and had left the final decision to the business unit, which they'd made yesterday. I also pointed out that everyone (even he) agreed it was a better long-term idea to key off the company-wide customer ID than an arbitrary 3-alpha code. It would take adjustment and work, but just like most infrastructure IT issues, the problem was only going to get worse with time, so we might as well do it now since we're already making changes to the process.

So, in review: I was asked to retrofit this entire domain on an insane schedule, and not only did I manage to do it, but I only pissed off the business unit a little while doing so. And when those concerns were voiced, I spent a considerable amount of time getting the full picture of their needs and requirements, suggesting changes that everyone agreed represented significant long-term improvements. Instead of being draconian, I left the most important decision to the business unit, and they forgot they needed to make it until it was almost an emergency. When I finally got sign-off, someone came out of the woodwork to complain, accusing me of being draconian and not taking everyone's needs into account. For all I know, they're back-channeling with the business unit to get the decision reversed.

All over a 3-character code in one type of service account in one domain.

Still want to be a sysadmin, kids?
