Thu, 19 Apr 2007

Yes, It Is Powerpoint, But...
Jacob has posted the deck for his Web 2.0 Expo talk on Geographic Distribution for Global Web Application Performance.

Tags: web on technorati, delicious, netscape, google

Last Updated: 04/19/2007 21:39   by Richard   | | Filed in: [/engineering]

Sat, 20 Jan 2007

We've All Got Our Problems
Yahoo was a leader in many areas: search, portal, messaging. But as the company has aged, its engineering teams have begun to suffer some of the same problems as the rest of us. Platforms that were once cutting-edge, often proprietary, need a hard look, and difficult, often expensive, choices must be made about their continued use.

A Yahoo insider comments on their dead-end infrastructure:

And let me tell you this. Yahoo! is now rotten from the inside out. Here's my take of how to fix Yahoo!'s engineering:

... 4) Slowly port all Yahoo! software to linux and phase out FreeBSD. Start supporting and encouraging multi-threading programming. I bet Google is laughing their asses off at us because we are still stuck with FreeBSD, gcc-2.95 and single process model.

...

5) Slowly get rid of all Yahoo-specialized open source software. Why do we have "YApache" (based on Apache 1.3), and why do we have the dreaded yut/ycore++ libraries when we can use STL and boost? And why do we have YPAN when we can just use CPAN??? The platform group is doing the wrong job supporting this dead-end infrastructure.

Tags: engineering on technorati, delicious, netscape, google

Last Updated: 01/20/2007 10:45   by Richard   | | Filed in: [/engineering]

Sun, 24 Sep 2006

Transformation: Bringing The Problem To Heel
The next set of activities in the engineering design process can be called "transformation." That is, the effort to transform the problem at hand into something solvable, moving from the unfamiliar to the familiar. We're adding progressively more structure as we move through the phases.

In this phase, we may begin to see pressures from the customer to produce. Be sensitive to that, while relating the problems to the long-term strategy or architecture framework under which you work. After the problem analysis, the customer may expect a fully-formed solution to just pop out. Manage those expectations early in the design process.

The guiding question in this phase is, "Have we solved this problem or something similar to this before?"

By answering that question, we decide whether we will be able to leverage prior efforts, avoid prior mistakes, or perhaps do both. Both of these allow us to move the problems toward solutions more quickly and hopefully in a more cost-effective way than blazing a new trail, although that too might be required.

To leverage prior efforts, we have to discern patterns in the problem that fit the work we've done before. Some examples of patterns in the SysAdmin world might be batch-processing, sessions, synchronous versus asynchronous communication, or layered security. These patterns are part of the problem domain that we noted in the analysis phase. The domain will have basic classes of objects and will formalize the hierarchy of those objects. As I noted in the first article in this series, the body of knowledge that might document this domain hierarchy is still not fully mature in much of the IT industry.

This problem seems brand-new, something you've never tackled? Working with the customer, manipulate it a bit:

Can the problem be:

  • Redefined - Do you need to design a dedicated widget or service to fit the need? Does the customer know that we already offer a similar service, scaled for X (where X is some number of transactions/time, amount of user traffic, amount of records) and ready today? Examples might be authentication, mail transport, indexing/search functions, or publishing.
  • Rearranged - Can we "start small?" Do we commit full resources to the solution if the outcomes (traffic estimates, sales, customer uptake, etc.) are unknown or sketchy in some way? Can some features go into a "version 2.0?"
  • Regrouped - Given the answers to the two questions above, can we consolidate or change the groupings of the sub-problems to fit a known pattern? Is batch-processing sufficient or do we need real-time output?
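
The "Redefined" question can be made concrete with a small catalog lookup. This is only a sketch: the service names, patterns, and capacity figures below are hypothetical, and a real catalog would track far more than a single throughput number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """An existing service we already run (all entries are hypothetical)."""
    name: str
    pattern: str            # the problem pattern it serves
    capacity_per_hour: int  # the "scaled for X" figure


# Illustrative catalog; none of these names or numbers come from a real site.
CATALOG = [
    Service("corp-sso", "authentication", 50_000),
    Service("outbound-mail", "mail transport", 200_000),
    Service("site-search", "indexing/search", 30_000),
]


def reusable_services(pattern: str, expected_per_hour: int) -> list[Service]:
    """Return existing services that match the pattern and can absorb the load."""
    return [
        s for s in CATALOG
        if s.pattern == pattern and s.capacity_per_hour >= expected_per_hour
    ]


# A customer asking for a dedicated mailer at 120,000 messages per hour
# may already be covered by what we run today.
matches = reusable_services("mail transport", 120_000)
print([s.name for s in matches])
```

An empty result does not settle the matter; it just tells you that the conversation shifts to scaling an existing service or designing a new one.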

At the end of this phase, you now have all of the definition and analysis necessary to begin idea development. You have eliminated duplications and are able to leverage prior work and existing systems. With this, you've reduced your problem size and hopefully complexity.

Past articles in this series:

  1. Analysis - Breaking It Down
  2. What Is It We're Doing Here? - Applying Engineering Principles To System Administration
  3. Who Is It? - Defining Who Might Benefit

Last Updated: 09/24/2006 15:35   by Richard   | | Filed in: [/engineering]

Thu, 07 Sep 2006

Analysis: Breaking It Down

A well-executed and thoughtful analysis of the problem at hand, provided by a knowledgeable set of professionals, can be extremely useful to your project. Time spent now analyzing the parts of the problem and devising a methodology for solving them will be handsomely repaid later in less re-work, fewer altered and slipped schedules, and less feature creep. All of those bad things mean one thing: money. Avoid or mitigate them and you'll save money.

How do we do that analysis?

The first step in this phase should be the decomposition of the problem into sub-problems. You might use a tool like an outliner or Visio to begin to sort the sub-problems or build a hierarchy of items. A visual representation of the sub-problems and how the problems relate to others up the chain can be useful. After a number of these exercises, your company may have a set of analysis procedures which standardize this process. If not, it should; don't reinvent the wheel each time.

How do you know when you've broken the problem down far enough? To decompose a problem into useful sub-problems, someone familiar with the domain of the problem should be engaged.

Suppose my website needs a shopping cart for on-line retail sales. Someone knowledgeable in the field of e-commerce would know that email delivery of order confirmations and shipment status is a de facto requirement in this space. Knowing that, we'd add "email delivery" or perhaps, more broadly "customer communication" as a sub-problem. We'd then engage someone who is knowledgeable in the delivery of mail at our expected volume.

This is a vastly simplified example, but the customer may not have the depth of knowledge to even articulate that portion of the problem beyond "yeah, we need to send receipts to our customers." At this point, if the sub-problem lives in one domain and can be solved as a unit, we've probably reached the bottom of this sub-problem and need no further decomposition. This is a somewhat tough thing to teach and comes from experience in the domain.
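
The stopping rule above (one domain, solvable as a unit) can be sketched as a check over a tree of sub-problems. Treat this as an illustration rather than a method: the "checkout" node and the domain labels are hypothetical additions to the shopping cart example, and deciding which domains a sub-problem touches is still the job of someone who knows the field.

```python
from dataclasses import dataclass, field


@dataclass
class SubProblem:
    """One node in the decomposition hierarchy."""
    name: str
    domains: set[str] = field(default_factory=set)  # knowledge domains involved
    children: list["SubProblem"] = field(default_factory=list)

    def is_bottom(self) -> bool:
        # Rule of thumb from the text: lives in one domain, no further split.
        return len(self.domains) == 1 and not self.children


def needs_more_decomposition(node: SubProblem) -> list[str]:
    """Names of undivided sub-problems that do not yet sit in a single domain."""
    if not node.children:
        return [] if node.is_bottom() else [node.name]
    flagged: list[str] = []
    for child in node.children:
        flagged.extend(needs_more_decomposition(child))
    return flagged


# Hypothetical breakdown of the shopping cart problem.
cart = SubProblem(
    "shopping cart",
    {"e-commerce", "mail", "payments"},
    [
        SubProblem("customer communication", {"mail"}),
        SubProblem("checkout", {"e-commerce", "payments"}),
    ],
)

flagged = needs_more_decomposition(cart)
print(flagged)
```

Here "customer communication" sits in a single domain and is left alone, while "checkout" still straddles two and is flagged for another pass.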

As we document sub-problems, it is important to keep focus and to relate every sub-problem back to the main problem at hand. In our shopping cart example, it might be interesting or even technically sexy to explore delivery of customer communications via alternative methods such as SMS, for example. However, someone knowledgeable in industry standards and best practices for the problem domain can tell you this is not a common practice or expectation and is really only peripherally related to the problem at hand: processing the sales of widgets on your website.

Here you may encounter the familiar tension between "engineering" and the business or marketing groups pushing for new gee-whiz features, when the design costs for them may be prohibitively expensive and any real-world utility suspect.

The last pass over the sub-problems should be used to arrive at a rough timetable and sequence for attacking each sub-problem. The sum of the time to complete the sub-problems, acknowledging that some may be done in parallel while others are serial in nature, may then be weighed against the deadline or time-to-market (ah, there's a buzzword) goal.
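
One way to account for the parallel and serial pieces is to compute the longest chain of dependent sub-problems, which is the usual critical-path calculation. The sketch below assumes made-up sub-problems, durations, and prerequisites for the shopping cart example, and it assumes the dependencies contain no cycles.

```python
from functools import lru_cache

# Hypothetical estimates: sub-problem -> (days of effort, prerequisites).
SUB_PROBLEMS = {
    "catalog": (10, []),
    "cart": (15, ["catalog"]),
    "payment": (20, ["cart"]),
    "email delivery": (8, ["cart"]),
    "launch review": (3, ["payment", "email delivery"]),
}


@lru_cache(maxsize=None)
def earliest_finish(name: str) -> int:
    """Day on which a sub-problem can finish, once all prerequisites are done."""
    days, prereqs = SUB_PROBLEMS[name]
    start = max((earliest_finish(p) for p in prereqs), default=0)
    return start + days


def total_duration() -> int:
    """Length of the longest dependency chain (the critical path)."""
    return max(earliest_finish(name) for name in SUB_PROBLEMS)


serial_days = sum(days for days, _ in SUB_PROBLEMS.values())
print(f"all serial: {serial_days} days, with parallel work: {total_duration()} days")
```

With these numbers, "payment" and "email delivery" overlap after "cart" is done, so the estimate drops from 56 days to 48. It is that second figure that gets weighed against the deadline.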

What are we trying to get out of the exercise?

Remember also, at this point, we're only documenting the problems, not solving them or writing project plans to implement possible solutions.

At the end of this work, you should have the following:

  • A coherent statement of the problem
  • Forecast of metrics, applicable dates, and financial bounds for solving the problem
  • A set of sub-problems, categorized in a useful way
  • A set of sub-problems in manageable, solvable chunks
  • Techniques that may be applied to solving each sub-problem
  • Identification of personnel (or gaps in knowledge) in your organization who can solve each sub-problem
  • A timeline and level-of-effort for solving the problem
  • A plan for reporting on the progress of solving each sub-problem

Next time, we'll review the sub-problems in the Transformation phase, to help us make use of past solutions and patterns from similar problems, so that we can arrive at cost-effective and standard solutions.

Hit me with some comments if you've got techniques or suggestions for boiling problems down into their basic components.

Past articles in this series:

  1. What Is It We're Doing Here? - Applying Engineering Principles To System Administration
  2. Who Is It? - Defining Who Might Benefit

Last Updated: 09/07/2006 21:16   by Richard   | | Filed in: [/engineering]

Sun, 03 Sep 2006

Who Is It?

Before proceeding to the next item in the engineering design process, I'm taking up David's (of Habiblog) question about my definition of 'system administrator'.

Again, depending on whom you ask, the term 'system administrator' can apply to many skill-sets or job roles. There's quite a range in skills and responsibility, often loosely related to the size or industry of the installation. The guy working to keep the systems running at a small company of 40 people has to be the jack of all trades, especially if the company is not in IT. The guy supporting a manufacturer of children's clothing may not have the budget to spend deeply on 'enterprise-class' gear or have dedicated and redundant hosts. He'll run hosts longer and harder (because he has to), and in many cases, it will be gear that has names like 'Digital' or 'IBM System xx' stamped on it. He'll twiddle with EDI, keep an eye on the frac-T1, answer questions on Win95 (still!), and wonder if he can scrounge a box to try out Linux and Samba. This is where many of us started.

However, the guy (likely with a team) supporting yahoo.com or aol.com has a larger budget, technology at hand that improves his job, a more specialized focus, and a commitment to some goal, be it traffic numbers, availability, search query response times, etc. Few of us start here.

There's a continuum in between: folks who support desktop users at a university, the girl who runs the Beowulf cluster at Sandia, and the great-UnixHead, who lives only to rack, compile, and tune.

What I am getting at is that this is a profession that is relatively young and is looking to mature. You see this in the dying throes of SAGE as it is extruded from USENIX and in the fitful self-consciousness of LOPSA (which, by the way, hasn't yet made me System Administrator of the Week). The frantic pace of technological change and industry upheaval (boom and bust cycles) has left precious little time for the profession to arrive at a body of knowledge. But I do see the emergence of leaders in the field, technologists who can contribute their considerable energies and skills to advancing the profession.

So, in answer to the question, "what is your definition of a 'system administrator' [in this context]?", I have to place the most emphasis on the sysadmin's ability to decide. Does the sysadmin have the skills, the management backing, and the resources to make his work strategic and to improve (insert overused term 'architect') his installation by hewing to engineering practices? In summary, I'm thinking of the SAGE Level IV system administrator.

Appropriate Responsibilities

  • Designs/implements complex local and wide-area networks of machines.
  • Manages a large, complex site or network.
  • Works under general direction from senior management.
  • Establishes/recommends policies on system use and services.
  • Provides technical lead and/or supervises system administrators, system programmers, or others of equivalent seniority.
  • Has purchasing authority and responsibility for purchase justification.

Last Updated: 09/03/2006 21:44   by Richard   | | Filed in: [/engineering]

Sun, 27 Aug 2006

What is it, man?
Depending on whom you ask, the very nature of system administration varies. Is it art? Black art? A profession advanced through the apprenticeship model? Are we smithies? Many would argue, especially the virus fighter or self-healing system designer, that system administration is a science, with natural rules and reactions.

Others might suggest, and often do, that we're somehow engineers. I'd argue that most of us are closer to locomotive engineers than to the trained and licensed engineering professional, one who uses repeatable processes, measurement, and design: the hallmarks of the engineering profession.

In the next few articles, I'll be walking the border between the province of the engineer and the shire of the common system administrator. I'll be using some textbook definitions of engineering terms and applying them to the things we do in our daily work. I do this not as an indictment of my very own profession, but in hope of peeling back the layers of myth and guesswork so prevalent in designing and running large computer systems. We've made tremendous strides in some areas, so all is not lost.

Many of us know our systems and applications deeply, and like our children, we know when they'll misbehave. To me, though, that's not a repeatable or easily teachable relationship or one that serves us well in the engineering or business worlds, where certainties such as total cost of ownership and project completion dates are expected.

Why do this? I'm interested in advancing our cause and strengthening the differentiation between unskilled and skilled, between the carpenter and the guy that built Taipei 101. Rather than simply stoking the fires and twisting the throttle, I'd like to see us closer to being CAPCOM, coolly wearing the vest and shepherding the mission (our system) that was scoped well, designed correctly, and responds in known and predictable ways.

We can improve the use of tools and measurement and make decisions based on hypotheses confirmed with data, rather than gut-feelings. The outcome is bound to be better systems, ones certainly better in cost of ownership and efficiency. If we've designed systems correctly, with a nod to modularity and standards, they'll scale better and our time-to-market will improve. Much of this is self-apparent to many of you, I'm sure. It hasn't reached everyone yet.

By way of definition, I'm using the terms "system" and "computer system" to describe a set of computers, processes, and interconnections, typically in a large installation supporting large corporations or government agencies. These are places where volume and traffic require true efficiency and scalability. I'm not describing the world of the Electrical Engineer, designing the actual boards or integrated circuits, nor are we hoping to apply much thought to setting up a print server for the twenty-five person company. We're not interested in the heat generated by the CPU or the hard disk, except in the most macro way, i.e. "how do we minimize cost, be that in energy consumption or HVAC, with our implementation?"

Let's start off with the engineering design process. I'm using an engineering textbook, Engineering: An Introduction to a Creative Profession [Beakley, Evans, Keats, 1986]. The specific naming of the stages is not so important and perhaps will vary according to the problem or system at hand, but the process itself lends us structure and provides a known path to progress from Need to Project Plan.

Stages of Engineering Design

  1. Identification of the problem
  2. Analysis
  3. Transformation
  4. Idea Development
  5. Modeling
  6. Information Gathering
  7. Experimentation
  8. Synthesis
  9. Evaluation and testing
  10. Presentation of the solution

Phew! That's quite a list of things to do before you build that new mail server, right? Well, that mail service is the thing that needs the process, not your lowly server. The ground-up design of the system, following industry and de facto standards, is the time and the place for the steps shown above. Let's delve into these a bit, shall we?

Identification

What is it that is needed? Seems simple, right? Requirements gathering and scoping are two of the major gotchas in this business. The marketing and business people come bearing grand, shiny ideas, marveling at their own cleverness, only to be ground down by the reality, the lack of support from legacy backend systems, and the endless questions from the 'engineers', who are now seen as a roadblock to be bypassed.

This is where we must persevere: to dig, to not take things at face value, to ask the same questions in different ways. You must ask not only what is needed, but also what is not needed. In helping the customer to fully identify the problem, we should consider some of the following areas and constraints in this first phase.

  • Can the purpose of the system be reduced to a single sentence or paragraph? Have we clarified to that level?
  • What is the scope of the problem? How many users, customers, data sets, or records are affected?
  • What are the interconnections, data flows, and processes that will feed our new system or allow it to feed other systems?
  • Can we identify, even in rough terms, the availability of funding and management commitment to the project? Is the system part of our charter or corporate mission?
  • What are our legal and ethical constraints for this system, its data, and its users?
  • What is the expected outcome of this design exercise? Are we simply studying the feasibility of solving the problem or actually moving to do that work?

Write these things down. The use of standardized forms or questions can help guide and document the discussions, but don't rely on them alone. Repeat back your understanding of the problem and constraints to the customer, letting him hear what you understand to be the problem. One other useful tactic in discussions of upgrading or replacing systems can be a review of the limitations of the old system.

With proper problem identification, you'll have formed an agreement with all concerned over what is actually to be fixed, designed, or upgraded, thereby reducing feature-creep or mismatched expectations later in the project, when refactoring or modification will be expensive and disruptive. Don't skimp at this stage.

Next time, we'll jump into the analysis stage for some thoughts on problem decomposition and application of known data to the problem that we've identified.

If you've got suggestions or maybe comments on how I've gotten it all wrong, hit the Comments link below and let's hear it.

Last Updated: 08/27/2006 20:04   by Richard   | | Filed in: [/engineering]