MIT Touchstone Project Planning


Goals:

Transition the IdPs to Shibboleth 2.1.4 release.

Phase One: Transition core MIT IdPs (idp.mit.edu)


Hardware

Idp1 and idp2 are running on RHEL3 physical machines. NIST has also provided idp2-dev, which is a RHEL3 machine as well. Bob has been using foonalagoona, which is provided by OPS/AMIT. This is not a RHEL3 machine. To complete the transition, new RHEL5 VMs will be requested from NIST:
 

    1. 1 dev machine
    2. 2 staging machines
    3. 2 production machines
    4. Configuration:
      1. minimum RAM 2GB, suggest 4GB
      2. at least 10GB disk, 7200 RPM
      3. The SWITCH federation recommends that, on a physical machine, the CPU should have 4 cores, each running at 2GHz. It has been noted that IdPs tend to be CPU bound, not disk I/O or network bandwidth intensive.

 
Once the transition to the new IdPs has been completed the following physical machines will no longer be needed by Touchstone:

    1. Idp1
    2. Idp2
    3. Idp2-dev

 
Once the transition to the new IdPs has been completed the following virtual machines will no longer be needed by Touchstone:

    1. Idp1-staging.mit.edu
    2. Idp2-staging.mit.edu
    3. Foonalagoona.mit.edu (OPS/AMIT)

Develop login page(s) that support multiple mechanisms, without using Stanford WebAuth.

  1. Authentication mechanisms:
    1. Username/password (done) urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
    2. X.509 certificates (via Apache mod_ssl) urn:oasis:names:tc:SAML:2.0:ac:classes:SoftwarePKI
    3. Kerberos via http-spnego (via modgssapache) urn:oasis:names:tc:SAML:2.0:ac:classes:Kerberos

The login page that presents all three of these mechanisms will be written in JSP. Work estimate is 2 to 3 weeks to have a proof of concept page.
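As a rough illustration of how the three mechanisms map to the authentication context classes listed above, the following Python sketch keeps the mapping in one table. The short mechanism keys (`password`, `x509`, `kerberos`) and the function names are hypothetical and chosen only for this example; the URNs are the ones given in the list. The actual login page is planned in JSP, so this is only a model of the lookup, not the implementation.

```python
# Sketch: map each login mechanism to the SAML 2.0 authentication
# context class named in the plan. The keys are hypothetical labels.
AUTHN_CONTEXT_CLASSES = {
    "password": "urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport",
    "x509": "urn:oasis:names:tc:SAML:2.0:ac:classes:SoftwarePKI",
    "kerberos": "urn:oasis:names:tc:SAML:2.0:ac:classes:Kerberos",
}


def authn_context_for(mechanism: str) -> str:
    """Return the context class URN for a mechanism label."""
    try:
        return AUTHN_CONTEXT_CLASSES[mechanism.lower()]
    except KeyError:
        raise ValueError(f"unknown mechanism: {mechanism!r}") from None


def mechanism_for(class_ref: str) -> str:
    """Reverse lookup: find the mechanism label for a context class URN."""
    for mechanism, urn in AUTHN_CONTEXT_CLASSES.items():
        if urn == class_ref:
            return mechanism
    raise ValueError(f"unsupported context class: {class_ref!r}")
```

Keeping the table in one place means the login page and any later reporting agree on which URN is asserted for each mechanism.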

High Availability plans

ShibHA is not available for the 2.x IdPs. The recommendation is to use Tomcat Terracotta. As of October 18th Bob had not started working with Terracotta. Bob estimates that he will need approximately 2 weeks to become familiar with Terracotta configuration issues.
 
Paul will request two test machines from NIST (RHEL5 VMs). These will start as test machines and become the new staging machines.
 
DNS round robin is being used today, and we plan to continue using it for the next phase of the project, despite Internet2's recommendation to use a hardware load balancer. We wish to avoid performing SSL termination at the F5, especially since usernames and passwords are being sent to the IdPs in some cases.
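Because the plan relies on DNS round robin, it can be useful to confirm which addresses a name currently publishes before and after pool changes. The sketch below uses Python's standard `socket.gethostbyname_ex`; the function names and the injectable `resolver` parameter are assumptions made for this example so that the check can be exercised without a live DNS lookup. It covers IPv4 addresses only.

```python
import socket
from typing import Callable, Iterable, List


def pool_addresses(hostname: str,
                   resolver: Callable = socket.gethostbyname_ex) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses published for hostname."""
    _name, _aliases, addresses = resolver(hostname)
    return sorted(set(addresses))


def pool_matches(hostname: str, expected: Iterable[str],
                 resolver: Callable = socket.gethostbyname_ex) -> bool:
    """True if the published pool is exactly the expected set of addresses."""
    return pool_addresses(hostname, resolver) == sorted(set(expected))
```

For example, `pool_addresses("idp.mit.edu")` would list whatever addresses the resolver returns at that moment; cached answers may lag behind a recent DNS change until the record's TTL expires.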

Migrate from SP attribute query to IdP attribute push.

This means that the user’s attributes will be included with the authentication assertion that is returned to the SP in the initial POST transaction. This will eliminate one network round trip between the SP and the IdP.
 
Bob has confirmed that the existing SPs will accept the push without requiring any changes to be made to any of the existing SP configurations. We will be removing the artifact query support from the MIT metadata. Need to determine when this change will be made.
 
(completed, week of 10/19/2009) Bob will check to make sure that no SPs are using artifact query, and perform a test to ensure that the removal does not break anything.

Pre-idp 2.1 deployment changes to the existing 1.3 IdPs

  1. Update the Apache server on the existing 1.3 IdPs to include the mod_rewrite module. This will require rebuilding Apache on RHEL3. Idp2-dev will be used. This is expected to take approximately one week.
  2. A second instance of the 1.3 IdP software will be installed on the existing IdPs. The new instances will support the new endpoint naming convention.
  3. Idp2-dev will be used for testing this prior to deployment to either staging or production.
  4. Test phase:
    1. This will include the new IdP instances, with the new endpoints running on the IdP staging servers.
    2. We will want some of our most heavily used SP customers to help with testing. We would like Stellar and Wikis staging and test instances to point to the core staging environment during this phase.

IDP 2.x deployment issues

entityIDs

There are currently two entityIDs or providerIDs that are used to describe the core MIT IdPs. Within InCommon our entityID is urn:mace:incommon:mit.edu. Within the campus federation our entityID is *https://idp-mit-edu.ezproxy.canberra.edu.au/shibboleth*. A single entityID can be used in two different federations. However, when doing so it is important to keep the data identical in the two different metadata files.
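If a single entityID were published in both federations, one way to check that the two metadata files carry identical data for it is to compare the canonical form of the matching EntityDescriptor from each file. The following is a minimal sketch using only Python's standard library (`xml.etree.ElementTree.canonicalize`, available in Python 3.8 and later). The function names are hypothetical, and the comparison deliberately ignores namespace prefix choices, attribute order, and surrounding whitespace; it does not verify signatures.

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def canonical_entity(metadata_xml: str, entity_id: str) -> str:
    """Return the canonical XML of the EntityDescriptor with this entityID."""
    root = ET.fromstring(metadata_xml)
    # iter() also visits the root, so a bare EntityDescriptor document works.
    for elem in root.iter(f"{{{MD_NS}}}EntityDescriptor"):
        if elem.get("entityID") == entity_id:
            return ET.canonicalize(
                ET.tostring(elem, encoding="unicode"),
                strip_text=True,        # ignore indentation differences
                rewrite_prefixes=True,  # ignore namespace prefix choices
            )
    raise LookupError(f"entityID not found: {entity_id}")


def same_entity_data(metadata_a: str, metadata_b: str, entity_id: str) -> bool:
    """True if both metadata documents describe entity_id identically."""
    return canonical_entity(metadata_a, entity_id) == canonical_entity(
        metadata_b, entity_id
    )
```

A check like this could be run whenever either metadata file is edited, so that a drift between the two copies is noticed before SPs pick it up.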


https://spaces.internet2.edu/display/InCCollaborate/IdP+entityID+Shift+to+URLs+--+FAQ indicates that new IdPs should use a URL style entityID. However, it also suggests that existing URN style entityID should not be migrated. It points out, “Changing an entityID may cause service disruption and require changes at many partner SP sites.  It is usually more important for entityIDs to remain stable.”

We should strongly consider ignoring this recommendation and migrating to the URL form of entityID within the InCommon metadata.

deployment migration and temporary SSO disruption

We’re thinking about adding the new IdPs to the existing idp.mit.edu DNS round robin. It should be understood that there will be no state sharing between the 1.3 IdPs and the 2.x IdPs. To a certain extent this will affect SSO. For users that have configured their browsers to always use a certificate, or always use Kerberos, there should be no visible change in behavior while the 1.3 and 2.x servers are both running.


Users who don’t have a mechanism automatically selected, but always click on certificates or Kerberos, may be presented with the login screen twice during a browser session.


The same is true for people that use username and password. They should only end up being prompted for their username and password one extra time during a typical browser session. Many users will not see a change in behavior.


Once we have confidence in the new IdPs, the 1.3 IdPs will be taken out of the DNS pool and taken out of service. How long the 1.3 and 2.x IdPs should be allowed to run concurrently is open to debate: perhaps only an hour, perhaps a couple of days. I expect that we will have a better idea once we have done this in the staging environment first.


Should the DNS TTL be lowered, during the time that both the 1.3 and 2.x IdPs are running concurrently?


Alternatives:

Bring up the new IdPs under a new DNS name, and add them to the MIT-metadata. As SPs take the new metadata, they will start using the new IdPs.


Note that this technique is not recommended by the Shib-user community or the Internet2 wiki. It can lead to a long transition time, and it is difficult to backtrack quickly if there are any problems. The best behaved SPs tend to only update their metadata once a day; many only update manually.


We could “throw the switch” and shut down the old IdPs and bring up the new IdPs during one scheduled short down time. This would require a short interruption of service. If there are problems with the 2.x IdPs it will mean there will be other interruptions of service.

Phase Two: Transition TouchstoneNetwork.net IdPs

All of the work to be done to migrate the core IdPs is also applicable to the TouchstoneNetwork IdPs, with the exception of the work required to eliminate Stanford WebAuth from the core IdPs. We expect that once the core IdPs have been transitioned, it will simply be a matter of applying nearly the same configurations to the new CAMS IdP machines, and testing them. I expect that this phase of the transition can be done in approximately two weeks.

Phase Three: improve SP registration

SP registration currently requires a person to send mail to an RT queue for processing. The person has to understand what type of information is required. Someone (Bob) has to edit the MIT-metadata.xml file with the submitted information in order for the registration to become effective.

Phase Four: Misc

a.       InCommon is strongly advocating the use of inline certificates, i.e. inline in the metadata. This will mean that if an MIT SP uses a single certificate for the user-facing SSL and for Shibboleth, then when the certificate has to be renewed, the system administrators will also have to register the new certificate in the MIT, and potentially the InCommon Federation, metadata.
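To make the renewal step concrete: an inline certificate is carried in the metadata as the base64 body of the PEM file, without the BEGIN/END lines, inside a `ds:X509Certificate` element. The sketch below shows one way to produce that fragment from a PEM file's text. The function names are hypothetical, and the code only checks that the body is valid base64; it does not parse or validate the certificate itself.

```python
import base64

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def pem_to_x509_body(pem_text: str) -> str:
    """Return the base64 body of the first PEM certificate, without whitespace."""
    start = pem_text.find(BEGIN)
    if start == -1:
        raise ValueError("no BEGIN CERTIFICATE line found")
    stop = pem_text.find(END, start)
    if stop == -1:
        raise ValueError("no END CERTIFICATE line found")
    body = "".join(pem_text[start + len(BEGIN):stop].split())
    # Raises binascii.Error (a ValueError subclass) if the body is not base64.
    base64.b64decode(body, validate=True)
    return body


def key_info_fragment(pem_text: str) -> str:
    """Build a ds:KeyInfo fragment carrying the certificate inline."""
    return (
        '<ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">'
        "<ds:X509Data><ds:X509Certificate>"
        + pem_to_x509_body(pem_text)
        + "</ds:X509Certificate></ds:X509Data></ds:KeyInfo>"
    )
```

The resulting fragment would sit inside the SP's `KeyDescriptor` in the metadata, which is why a renewed certificate has to be re-registered wherever that metadata is published.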

Projected timeline

Week of October 26 (~3 days available)

·         Request new VMs from NIST

·        Remove the artifact query support from the MIT metadata.

·         Schedule CAMS restart with new Moira WS settings

·         Work on new login pages (elimination of Stanford WebAuth)

·         (2 days on Athena)

Week of November 2nd (4 days available)

·         Work on new login pages (elimination of Stanford WebAuth)

·         (1 day on Athena)

Week of November  9 (3 days available)

·         Work on new login pages (elimination of Stanford WebAuth)

·         Work on Apache rebuild for RHEL3

·         (2 days on Athena)

Week of November  16 (3 days available)

·         Work on Apache rebuild for RHEL3

·         Install second instance of 1.3 IdP on idp2-dev.mit.edu

·         (2 days on Athena)

Week of November  23 (1 day available)

·         Thursday and Friday holidays

·         Wed travel?

·         (1 day on Athena)

·         Idp2-dev configuration and testing

Week of  November 30 (2 days available)

·         Start Terracotta investigation

·         (3 days on Athena)

Week of December 7 (0 days)

·         (5 days on Athena)

Week of  December 21 (2 days available?)

·         Terracotta

·         (Athena)

·         (Holidays)

Week of December 28 (  3 days available?)

·         Terracotta

·         (1  Day Athena)

Week of January 4th (5 days available)

·         Terracotta

·         Start testing

Deploy week of January 25th

Deploy to CAMS week of February 22                


