Suggestion regarding server update/PU/patches

Discussion in 'Player Support' started by Serafine, Dec 18, 2013.

  1. Serafine

    This is the current situation:

    - SOE applies a patch
    - all servers crash (this isn't the first time, but it's by far the worst)
    - SOE tries to find a solution while all servers are down
    => all people going "OMFG SERVERS DOWN! RAGE!"



    Why not do it like this:

    - SOE applies patch to ONE live server (test balloon)
    - server dies
    - SOE tries to find solution while all other live servers are still running fine
    - fixed patch (or patch and fix) is applied to all live servers
    => working patch with minimal customer inconvenience

    Is it really so hard? Why the hell is it necessary to bug/kill all servers simultaneously?
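
For what it's worth, the test-balloon idea above can be sketched roughly like this. It is only a minimal illustration in Python, not SOE's actual tooling: `canary_rollout`, `apply_patch`, and `is_healthy` are hypothetical names, and a real health check would have to look at crashes, logins, and so on.

```python
from typing import Callable, Dict, Iterable, List, Union

Result = Dict[str, Union[str, List[str]]]


def canary_rollout(
    servers: Iterable[str],
    apply_patch: Callable[[str], None],   # hypothetical: patches one server
    is_healthy: Callable[[str], bool],    # hypothetical: True if the server survived
) -> Result:
    """Patch one server first; touch the others only if it stays healthy."""
    servers = list(servers)
    if not servers:
        return {"status": "nothing_to_do", "patched": [], "skipped": []}

    # The first server in the list plays the role of the test balloon.
    canary, rest = servers[0], servers[1:]
    apply_patch(canary)

    if not is_healthy(canary):
        # The balloon popped: every other live server is left untouched.
        return {"status": "aborted", "patched": [canary], "skipped": rest}

    # The balloon survived: apply the same patch to the remaining servers.
    for server in rest:
        apply_patch(server)
    return {"status": "complete", "patched": servers, "skipped": []}
```

If the health check on the first server fails, the function returns with status "aborted" and the rest of the list stays unpatched, which is the whole point of the suggestion.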
    • Up x 3
  2. MrQuadro

    after prime time.
    • Up x 1
  3. vsae

    That is exactly what they do. Except that US prime time is the only prime time SOE knows of.
    • Up x 1
  4. PapaHoleNUrHead

    In all fairness to SOE, and speaking as a systems administrator for the past 23 years, they are following best practices with their patching process while still maintaining feasible administrative control: 1) implement the patch in a test environment first, 2) allow adequate time for testing in the test environment, 3) implement the patch in the production environment at a time that has the least impact on end users. Patching each server one at a time is an administrative nightmare because of the resource cost versus the benefit gained. Using an automated process to patch systems actually decreases the likelihood of problems being introduced into the environment. And honestly, it would take several more of these incidents to justify moving away from this process when you consider cost vs. gain.
  5. xdox

    Not to say that I agree with what is happening, but bad things can happen during patching. And as Papa said, it is far better to automate the process than to do one server at a time.

    Also keep in mind that not all systems are 1 realm 1 server (I'm talking in general here, not PS2 specific); it might even be impossible to update just one server because of some changes (say, they modified how the server communicates with the login server or database). Other than that, deploying an update to all systems at once should in theory have a smaller impact on the user experience than shutting down one server (which might affect others, be it through a login server takedown or through pure server load) and then shutting everything down again to deploy to the rest.
  6. Serafine

    You mean like during EU prime time? Because they ALWAYS patch during EU prime time.

    I'm not saying that they should patch one server after the other. I'm just saying that maybe they should send up a test balloon if possible (like patching one server at a time when nobody really cares) BEFORE the scheduled patching of all servers. Then, if that balloon fails horribly, they can just skip the patch and try to find a solution without breaking every single server. They could even make Jaeger a live test balloon, so if it fails no real live server would be affected at all.

    And I do know they have a test server, but apparently it's not working out, besides the fact that the test server tends to constantly behave differently from the live servers (like all those issues with missing splash damage and stuff).
    • Up x 1