Monday, December 19, 2016

Single Sign On with SAML 2.0

Security Assertion Markup Language or SAML is the secure XML based communication standard for communicating identities, exchanging authentication and authorization data between parties. SAML is a specification which defines messages and their format, message encoding methods, message exchange protocols, and other recommendations. SAML addresses the primary use case of internet single sign on (SSO) which authenticates the user using a single login into the system and allows access to other affiliated systems without additional authentication. SAML thus eliminates multiple authentication credentials in multiple locations and reduces the number of the users being authenticated. It separates the security framework from platform architecture and specific vendor implementation. SAML involves three entities, the user, identity provider and server provider. The Identity Provider maintains a directory of users and an authentication mechanism to authenticate them. The Service Provider is the target application that a user tries to use.

SAML consists of six components as follows: assertions, protocols, bindings, profiles, metadata, authentication context. The components mainly enable to transfer secure information like identity, authentication, and authorization information between trusted entities.


SAML assertions contain identifying information made by a SAML authority. In SAML, there are three assertions: authentication, attribute, and authorization. Authentication assertion validates that the specified subject is authenticated by a particular means at a particular time and is made by a SAML authority called an identity provider. Attribute assertion contains specific information about the specified subject. And authorization assertion identifies what the specified subject is authorized to do.
SAML protocols define how SAML asks for and receives assertions and the structure and contents of SAML protocols are defined by the SAML-defined protocol XML schema.
SAML bindings define how SAML request-response message exchanges are mapped to communication protocols like Simple Object Access Protocol (SOAP). SAML works with multiple protocols including Hypertext Transfer Protocol (HTTP), Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP) and so on.
SAML profiles define constraints and/or extensions to satisfy the specific use case of SAML. For example, the Web SSO Profile details how SAML authentication assertions are exchanged between entities and what the constraints of SAML protocols and binding are. An attribute profile on the other hand establishes specific rules for interpretation of attributes in SAML attribute assertions. For instance, X.500/LDAP profile details how to carry X.500/LDAP attributes within SAML attribute assertions.
SAML metadata defines a way to express and share configuration information between SAML entities. For instance, an entity's supported SAML bindings, operational roles (IDP, SP, etc), identifier information, supporting identity attributes, and key information for encryption and signing can be expressed using SAML metadata XML documents. SAML Metadata is defined by its own XML schema. In a number of situations, a service provider may need to have detailed information regarding the type and strength of authentication that a user employed when they authenticated at an identity provider.
SAML authentication context is used in (or referred to from) an assertion's authentication statement to carry this information. A service provider can also include an authentication context in a request to an identity provider to request that the user be authenticated using a specific set of authentication requirements, such as a multi-factor authentication.

SAML Assertion
An assertion is a package of information that supplies zero or more statements made by a SAML authority usually about a subject such as a user. SAML assertions are issued from the Identity Provider(also called Asserting Party) to the Service Provider (also called as Relying Party). When the user has authenticated with the Identity Provider a SAML Assertion is sent to the Service Provider with the Identity Provider's information about that user. It is represented by the <Subject> element. SAML specification defines three different kinds of assertion statements that can be created by a SAML authority. All SAML-defined statements are associated with a subject. The three kinds of statement defined in the specification are:
  • Authentication: The subject was authenticated by a particular means at a particular time.
  • Attribute: The subject is associated with the supplied attributes with values mostly using LDAP.
  • Authorization Decision: A request to allow the subject to access the specified resource has been granted or denied using the given evidence.
SAML authentication request protocol enables third-party authentication of a subject. It is useful in the cases to limit the scope within which an identifier is used to a small set of system entities.

Name Identifiers
Name Identifiers are identifiers for subjects and the issuers of assertions and protocol messages. They help to establish a means by which parties may be associated with identifiers that are meaningful to each of the parties. They help to limit the scope within which an identifier is used to a small set of system entities. Two or more system entities may use the same name identifier value when referring to different identities. SAML provides name qualifiers to disambiguate a name identifier by effectively placing it in a federated namespace related to the name qualifiers. The <BaseID> element is an extension point that allows applications to add new kinds of identifiers. The NameIDType complex type is used when an element serves to represent an entity by a string-valued name. Its more restricted form of identifier than the <BaseID> element and is the type underlying both the <NameID> and <Issuer> elements. The <NameID> element is of type NameIDType, and is used in various SAML assertion constructs such as the <Subject> and <SubjectConfirmation> elements, and in various protocol messages. The <EncryptedID> element is of type EncryptedElementType, and carries the content of an unencrypted identifier element in encrypted fashion. The <Issuer> element, with complex type NameIDType, provides information (name etc) about the issuer of a SAML assertion or protocol message.

Assertions has following elements:
  • The <AssertionIDRef> element makes a reference to a SAML assertion by its unique identifier. The specific authority who issued the assertion or from whom the assertion can be obtained is not specified as part of the reference. 
  • The <AssertionURIRef> element makes a reference to a SAML assertion by URI reference.
  • The <Assertion> element is of the AssertionType complex type and specifies the basic information common to all assertions, such as version issual time, identifier, issuer, signature, subject, authentication etc.
  • The SAML assertion MAY be signed adding the <ds:Signature> element, which provides both authentication of the issuer and integrity protection.
  • The <EncryptedAssertion> element represents an assertion in encrypted fashion.

The Subjects section defines the SAML constructs used to describe the subject of an assertion. The optional <Subject> element specifies the principal that is the subject of all of the (zero or more) statements in the assertion. It identifies the subject using <BaseID>, <NameID>, or <EncryptedID> and confirms it using <SubjectConfirmation> element. A <Subject> element can contain both an identifier and zero or more subject confirmations which a relying party (service provider) can verify when processing an assertion. A <Subject> element SHOULD NOT identify more than one principal.
  • The <SubjectConfirmation> element provides the means for a relying party (service provider) to verify the correspondence of the subject of the assertion with the party with whom the relying party is communicating. It has the method which identifies a protocol or mechanism to be used to confirm the subject.
  • The <SubjectConfirmationData> element has specifies additional data that allows the subject to be confirmed or constrains the circumstances under which the act of subject confirmation can take place. The KeyInfoConfirmationDataType complex type constrains a <SubjectConfirmationData> element to contain one or more <ds:KeyInfo> elements that identify cryptographic keys that are used in some way to authenticate an attesting entity.

The <Conditions> element place constraints on the acceptable use of SAML, such as Validity, Audience Restriction, Usage and Proxy Restrictions.
The <Advice> element contains any additional information that the SAML authority wishes to provide to a relying party (service provider).
The <Statement> element is an extension point that allows other assertion-based applications to reuse the SAML assertion framework.
The <AuthnStatement> element describes a statement by the SAML authority asserting that the assertion subject was authenticated by a particular means at a particular time.
The <SubjectLocality> element specifies the DNS domain name and IP address for the system from which the assertion subject was authenticated.
The <AuthnContext> element specifies the context of an authentication event and can contain authentication context class reference, an authentication context declaration or declaration reference,
or both.
The <AttributeStatement> element describes a statement by the SAML authority asserting that the assertion subject is associated with the specified attributes. Assertions containing <AttributeStatement> elements MUST contain a <Subject> element. It has <Attribute>, <AttributeValue>, <EncryptedAttribute> and <AuthzDecisionStatement> elements.

SAML protocol messages can be generated and exchanged using a variety of protocols. The protocols defined by SAML achieve the following actions:
  • Returning one or more requested assertions. This can occur in response to either a direct request for specific assertions or a query for assertions that meet particular criteria.
  • Performing authentication on request and returning the corresponding assertion.
  • Registering a name identifier or terminating a name registration on request.
  • Retrieving a protocol message that has been requested by means of an artifact.
  • Performing a near-simultaneous logout of a collection of related sessions (“single logout”) on request.
  • Providing a name identifier mapping on request.

RequestAbstractType
All SAML requests are of types that are derived from the abstract RequestAbstractType complex type. It has following attributes:
  • ID: An unique identifier required for the request.
  • Version: The version of this request.
  • IssueInstant: The time instant of issue of the request encoded in UTC.
  • Destination: An optional URI reference indicating the address to which this request has been sent in order to prevent malicious forwarding of requests to unintended recipients.
  • Consent: An optional indicator which indicates that the consent has been obtained from a principal in the sending of this request.
  • Issuer: An optional <saml:Issuer> identifies the entity that generated the request message.
  • Signature: An optional <ds:Signature> that authenticates the requester and provides message integrity.
  • Extensions: The extension point contains optional protocol message extension elements that are agreed on between the communicating parties.

StatusResponseType
All SAML responses are of types that are derived from the StatusResponseType complex type. It has following attributes:
  • ID: An unique identifier required for the response.
  • InResponseTo: A reference to the identifier of the request to which the response corresponds.
  • Version: The version of this response.
  • IssueInstant: The time instant of issue of the response encoded in UTC.
  • Destination: An optional URI reference indicating the address to which this response has been sent in order to prevent malicious forwarding of responses to unintended recipients.
  • Consent: An optional indicator which indicates that the consent has been obtained from a principal in the sending of this response.
  • Issuer: An optional <saml:Issuer> identifies the entity that generated the response message.
  • Signature: An optional <ds:Signature> that authenticates the responder and provides message integrity.
  • Extensions: The extension point contains optional protocol message extension elements that are agreed on between the communicating parties.
  • Status: The <Status> is the required code representing the status of the corresponding request. It contains <StatusCode> representing the status of the activity, <StatusMessage> and <StatusDetail> having additional information concerning the status of the request. The status code can be successful with code as "urn:oasis:names:tc:SAML:2.0:status:Success" or varying failure codes as in the specification.
The existing assertions can be requested by uniquely identified reference using <AssertionIDRequest> or queried for assertions by subject or statement type using the <SubjectQuery>, <AuthnQuery>, <RequestedAuthnContext>, <AttributeQuery> and <AuthzDecisionQuery> elements.

The <Response> message element which is an extension of StatusResponseType is used when a response consists of a list of zero or more assertions that satisfy the request. It has additional <saml:Assertion> or <saml:EncryptedAssertion> which specifies an assertion or encrypted assertion by value. In response to a SAML-defined query message, every assertion returned by a SAML authority must contain a <saml:Subject> element that matches the <saml:Subject> element found in the query. The identifier element (<BaseID>, <NameID>, or <EncryptedID>) or at least one <saml:SubjectConfirmation> element must match between the <saml:Subject> elements of the query and its response.

Authentication Request Protocol
When a principal wants to obtain assertions containing authentication statements to establish a security context at one or more relying parties, it uses the authentication request protocol to send an <AuthnRequest> message element to a SAML authority. The returned <Response> message which contains one or more assertions must have at least one assertion which contains at least one authentication statement. Initially the Requester creates the authentication request. The Presenter then sends the <AuthnRequest> message providing the properties required for resulting assertion to the identity provider and either authenticates itself to the identity provider or relies on an existing security context to establish its identity. The process of authentication of the presenter may take place before, during, or after the initial delivery of the <AuthnRequest> message. The <AuthnRequest> message is mostly signed or authenticated by the protocol binding used to deliver the message. An Identity Provider provides identifiers for users looking to interact with a system, and issues an assertion along with an authentication statement. The request presenter ideally is the attesting entity which satisfies the subject confirmation requirements within the <SubjectConfirmation> elements of the resulting assertion. The responder replies to an <AuthnRequest> with a <Response> message either containing one or more assertions meeting the specifications defined by the request or a <Status> describing the error occurred. The presenter can be directed to another identity provider by the responder while issuing its own <AuthnRequest> message, so that the resulting assertion can be used to authenticate the presenter to the original responder. The returned assertion(s) contains a <saml:Subject> element which represents the presenter. The identifier type and format are determined by the identity provider. At least one statement in at least one assertion is a <saml:AuthnStatement> which describes the authentication performed by the responder or authentication service associated with it. The resulting assertion(s) also contains a <saml:AudienceRestriction> element referencing the requester as an acceptable relying party (service provider). The Relying Party consumes the assertion(s) to establish a security context and to authenticate or authorize the requested subject in order to provide a service.  Identity provider may skip the creation of a new <AuthnRequest> for the authenticating identity provider, when authenticating the same presenter for a second requester.

The AuthnRequest element extends from RequestAbstractType and adds the below additional attributes.
  • <saml:Subject>: The optional subject attribute specifies the requested subject of the resulting assertion. It may include one or more <saml:SubjectConfirmation> elements to indicate how and/or by whom the resulting assertions can be confirmed. When subject attribute is absent the presenter of the message is presumed to be the requested subject. When no <saml:SubjectConfirmation> elements are included, then the presenter is presumed to be the only attesting entity required.
  • <NameIDPolicy>: It specifies the constraints on the name identifier (e.g. name idenfier format for the URI reference) to be used to represent the requested subject. If omitted any type of identifier supported by the identity provider for the requested subject can be used.
  • <saml:Conditions>: Specifies the SAML conditions the requester expects to limit the validity and/or use of the resulting assertion(s).
  • <RequestedAuthnContext>: Specifies the requirements the requester places on the authentication context that applies to the responding provider's authentication of the presenter.
  • <Scoping>: It specifies a set of identity providers trusted by the requester to authenticate the presenter, as well as limitations and context related to proxying of the <AuthnRequest> message to subsequent identity providers by the responder.
  • ForceAuthn: When "true" the identity provider must authenticate the presenter directly rather than rely on a previous security context. Default value is false.
  • IsPassive: When "true" the identity provider and the user agent itself must not visibly take control of the user interface from the requester and interact with the presenter. Default value is false.
  • AssertionConsumerServiceIndex: Indirectly identifies the location to which the <Response> message should be returned to the requester. It applies only to profiles in which the requester is different from the presenter. When omitted the identity provider returns the <Response> message to the default location associated with the requester for the profile of use. It is mutually exclusive with the AssertionConsumerServiceURL and ProtocolBinding attributes.
  • AssertionConsumerServiceURL: Specifies by value the location to which the <Response> message should be returned to the requester. The responder ensures the value specified is associated with the requester usually by signing the enclosing <AuthnRequest> message is another.
  • ProtocolBinding: A URI reference that identifies a SAML protocol binding to be used when returning the <Response> message.
  • AttributeConsumingServiceIndex: Indirectly identifies information associated with the requester describing the SAML attributes the requester desires or requires to be supplied by the identity provider in the <Response> message.
  • ProviderName: Human-readable name of the requester used by the presenter's user agent or the identity provider.

Artifact Resolution Protocol
The Artifact Resolution Protocol provides the mechanism to transport SAML protocol messages in a SAML binding by reference instead of by value. The requests and responses can be obtained by reference using the protocol. A message sender sends a small piece of data called an artifact using the binding instead of binding a message to a transport protocol. Its mainly used when the bindings is unable to carry the message due to size constraints or usage of secure channel without signature. The <ArtifactResolve> message is used to request that a SAML protocol message be returned in an <ArtifactResponse> message by specifying an artifact that represents the SAML protocol message.The <ArtifactResolve> message is either signed or protected by the protocol binding used to deliver the message. ArtifactResolve element extends RequestAbstractType and adds <Artifact> value that the requester received and now wishes to translate into the protocol message it represents. If the responder recognizes the artifact as valid, then it responds with the associated protocol message in an <ArtifactResponse> message element else the response has no embedded message.

Protocol Bindings
Mappings of SAML request-response message exchanges onto standard messaging or communication protocols are called SAML protocol bindings. It is a mapping of SAML messages to a representation that can be transmitted by an HTTP client over the network interface. All bindings must use HTTP with Secure Sockets Layer (SSL) or Transport Layer Security (TLS). SAML also offers mechanisms for parties to authenticate to one another, but in addition SAML may use other authentication mechanisms to provide security for SAML itself especially when the message passes through an intermediary channels.

RelayState
Some bindings define a "RelayState" mechanism for preserving and conveying state information. The RelayState parameter is used to restore the original application URL so that the user can return to the application with a SAML assertion. Exposing the application URL in SAML messages can be a security risk. For service provider-initiated SSO, the service provider saves the URL and places the name of the cookie in the relay state. For identity provider-initiated SSO this option is not available. Instead we have the identity provider place an alias for the application in the relay state and map the alias to the application on the service provider. RelayState is a parameter used by some SAML protocol implementations to identify the specific resource at the resource provider in an Identity Provider initiated single sign on scenario.

SAML SOAP Binding
SOAP is a lightweight protocol which uses XML technologies to define an extensible messaging framework providing a message construct that can be exchanged over a variety of underlying protocols. A SOAP message is fundamentally a one-way transmission between SOAP nodes from a SOAP sender to a SOAP receiver, possibly routed through one or more SOAP intermediaries. SOAP defines an XML message envelope that includes header and message body sections, allowing data and control information to be transmitted. SAML request-response protocol elements are enclosed within the SOAP message body. SAML messages can be transported using SOAP without re-encoding from the standard SAML schema to one based on the optional SOAP encoding system. A single SOAP message does not have more than one SAML request or response element or any additional XML elements in the SOAP body. SAML SOAP request may have arbitrary headers but does not require any headers to process SAML messages. The SAML message are not cached by the HTTP proxies using the Cache-Control and Pragma headers. The SOAP processing error is returned with a <SOAP-ENV:fault> element, while the SAML error is returned with the <samlp:Status> element within the SOAP body.

Example SOAP Request
 POST /SamlService HTTP/1.1
 Host: www.example.com
 Content-Type: text/xml
 Content-Length: nnn
 SOAPAction: http://www.oasis-open.org/committees/security
 
    
       
           ... 
          
            ...
          
       
    
 

Example SOAP Response
 HTTP/1.1 200 OK
 Content-Type: text/xml
 Content-Length: nnnn
 
   
     
       https://www.example.com/SAML
        ... 
       
       
       
       
         
           ...
         
         
           ...
         
       
     
   
 

HTTP Redirect Binding
SAML protocol messages are transmitted within URL parameters using the HTTP Redirect binding. The XML messages on URL are encoded using specialized URL encodings and transmitted using the HTTP GET method. While the complex message content are sent using HTTP POST or Artifact bindings. Binding endpoints indicate the encodings which they support using the metadata. A URL encoding places the message entirely within the URL query string, and reserves the rest of the URL for the endpoint of the message recipient. A SAMLEncoding query string parameter named is used to identify the encoding mechanism used. When the SAMLEncoding parameter is omitted, then the default value is urn:oasis:names:tc:SAML:2.0:bindings:URL-Encoding:DEFLATE, i.e. DEFLATE encoding which is supported by all endpoints.
      Before applying the DEFLATE compression mechanism to the entire XML content of the original SAML protocol message,  any signature on the SAML protocol message, including the <ds:Signature> XML element is removed. The compressed data is later base64-encoded with linefeeds and whitespaces removed. The base-64 encoded data is then URL-encoded, and added to the URL as SAMLRequest or SAMLResponse query string parameter based on whether the message is SAML request or response. RelayState is included with a SAML protocol message transmitted using HTTP redirect binding. The RelayState data is URL-encoded and placed in an additional RelayState query string parameter. The value of relaystate does not exceed 80 bytes in length and its validity is verified using a checksum with a pseudo-random value. The responder sends the RelayState parameter in the SAML protocol response. If the original SAML protocol message was signed with an XML digital signature then the URL-encoded form of the message is signed. An additional query string parameter SigAlg is included identifying the signature algorithm used to sign the URL-encoded SAML protocol message. Signature is constructed by concatenating RelayState if present, along with SigAlg and SAMLRequest (or SAMLResponse) query parameters ordered as SAMLRequest=value&RelayState=value&SigAlg=value. The resulting octet string is fed into the signature algorithm. The signature is then encoded using the base64 encoding with any whitespace removed, and included as a query string parameter named Signature. The supported signature algorithms are DSAwithSHA1 and RSAwithSHA1, with their URI representations supported with the encoding mechanism. The order of the query string parameters on the resulting URL while verifying signatures varies depending upon implementation. The URL encoding is not canonical and there are multiple encodings for a given value, hence the relying party performs the verification step using the original URL-encoded values it received on the query string. Sample SAML URL with signature is https://idp.com?SAMLResponse=xxxx&SigAlg=xxxx&Signature=xxxx
     When the message is signed, the Destination XML attribute in the root SAML element of the message contains the URL to which the sender has instructed the user agent to deliver the message.  The recipient verifies that the value matches with the location at which the message has been received.

Below are the HTTP request response message exchanges using the HTTP redirect binding.
  • When the user agent first makes an arbitrary HTTP request to a system entity, the system entity decides to initiate a SAML protocol exchange inorder to process the request.
  • The system entity acting as a SAML requester responds to the HTTP request from the user agent by returning a SAML request. The SAML request is returned encoded into the HTTP response's Location header with HTTP status 303 or 302. The user agent delivers the SAML request by issuing an HTTP GET request to the SAML responder.
  • The SAML responder may respond to the SAML request by immediately returning a SAML response or might return arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request.
  • The responder returns a SAML response to the user agent in a similar way as SAML requester responds to the HTTP request. The SAML response is then returned to the SAML requester.
  • Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

If the signature and assertion are valid, the service provider establishes a session for the user and redirects the browser to the target resource.

HTTP POST Binding
SAML protocol messages can be transmitted within the base64-encoded content of an HTML form control using HTTP POST binding. It is used when the communicating parties do not share a direct path of communication and, SAML requester or responder need to communicate using an HTTP user agent as an intermediary. Also when the responder requires interactions with the user agent to fulfill the request, HTTP POST binding is used. XML Messages with this binding are encoded into an HTML form control and are transmitted using the HTTP POST method. A SAML protocol message is form-encoded by applying the base-64 encoding rules to the XML representation of the message and placing the result in a hidden form control within a form. Based on the message, SAML request or SAML response, the form control is named SAMLRequest or SAMLResponse respectively. RelayState is optional and included as RelayState hidden form control has maximum length of 80 bytes. The action attribute of the form is the recipient's HTTP endpoint to which the SAML message is delivered, with the method attribute being "POST". All the form control values are transformed to be included in an HTML document.
      The intermediary user agent prevents to rely on the transport layer for end-end authentication. Hence SAML enables Form-encoded messages to be signed before applying base64 encoding. When the message is signed, the Destination XML attribute in the root SAML element contains the URL to which the sender instructed the user agent to deliver the message. The recipient verifies that the value matches the location at which the message has been received. The individual "RelayState" and SAML message values can be integrity protected, but not the combination.


Below are the HTTP request response message exchanges using the HTTP POST binding.
  • When the user agent makes an arbitrary HTTP request to a system entity, the system entity initiates a SAML protocol exchange.
  • The system entity acting as a SAML requester responds to an HTTP request from the user agent by returning a SAML request. The user agent delivers the SAML request by issuing an HTTP POST request to the SAML responder.
  • The SAML responder responds to the SAML request by immediately returning a SAML response or returns arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request.
  • The responder finally returns a SAML response to the user agent to be returned to the SAML requester.
  • Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

HTTP Artifact Binding
In the HTTP Artifact binding, the SAML request or the SAML response, or both are transmitted by reference using a small stand-in called an artifact. A separate, synchronous binding, such as the SAML SOAP binding, is used to exchange the artifact for the actual protocol message using the artifact resolution protocol defined in the SAML assertions and protocols specification. The artifact binding can be composed with HTTP Redirect binding to transmit request and response messages in a single protocol exchange using two different bindings. Since the artifact binding is resolved using another synchronous binding, a direct communication path must exist between the SAML message sender and recipient in the reverse direction of the artifact's transmission. The URL parameter encoding or the HTML form control are used for artifact message encoding. RelayState with value upto 80 bytes can be included with a SAML artifact transmitted using artifact binding. The general format of an artifact includes a mandatory two-byte artifact type code (TypeCode) and a two-byte index value identifying a specific endpoint of the artifact resolution service of the issuer (EndpointIndex). Each issuer is assigned an identifying URI, also known as the issuer's entity. The artifact type code contains a 20 byte SourceID which is used by the artifact receiver to determine artifact issuer identity and the set of possible resolution endpoints maintained by destination site.

Below are the HTTP request response message exchanges using the HTTP Artifact binding.
  • When the user agent makes an arbitrary HTTP request to a system entity, the system entity decides to initiate a SAML protocol exchange.
  • The system entity acting as a SAML requester responds to an HTTP request from the user agent by returning an artifact representing a SAML request. If URL-encoded, the artifact is returned encoded into the HTTP response's Location header, and the HTTP status is either 303 or 302. If form-encoded, then the artifact is returned in an XHTML document containing the form and content. The user agent delivers the artifact by issuing either an HTTP GET or POST request to the SAML responder.
  • The SAML responder determines the SAML requester by examining the artifact and issues a <samlp:ArtifactResolve> request containing the artifact to the SAML requester using a direct SAML binding, thus temporarily reversing roles.
  • The SAML requester returns a <samlp:ArtifactResponse> containing the original SAML request message it wishes the SAML responder to process.
  • The SAML responder responds to the SAML request by either immediately returning a SAML artifact or returning an arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request. 
  • The responder finally returns a SAML artifact to the user agent to be returned to the SAML requester. The SAML requester determines the SAML responder by examining the artifact, and issues a <samlp:ArtifactResolve> request containing the artifact to the SAML responder using a direct SAML binding.
  • The SAML responder returns a <samlp:ArtifactResponse> containing the SAML response message it wishes the requester to process.
  • Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

SAML URI Binding
The SAML URI Binding supports the encapsulation of a <samlp:AssertionIDRequest> message with a single <saml:AssertionIDRef> into the resolution of a URI. URI resolution can occur over multiple underlying transports mostly HTTP with SSL 3.0. A SAML URI reference identifies a specific SAML assertion. The result of resolving the URI is a message containing the assertion, or a transport-specific error. The specific format of the message depends on the underlying transport protocol. If the transport protocol permits the returned content to be described, then the assertion may be encoded in a custom format. When the same URI reference is resolved in the future, then either the same SAML assertion, or an error, is returned. The SAML reference should consistently reference the same SAML assertion.

Thursday, January 14, 2016

Functional Programming in Scala

Functional programming (FP) is based on a simple premise that we only use pure functions with no side effects in programs which have far-reaching implications. A pure function is one that lacks side effects. A side effect is something which function does other than returning the result such as Modifying a variable or field in object, or writing to an external file.

A function f with input type A and output type B (written in Scala as a single type: A => B, pronounced “A to B” or “A arrow B”) is a computation that relates every value a of type A to exactly one value b of type B such that b is determined solely by the value of a. Any changing state of an internal or external process is irrelevant to computing the result f(a). Hence when a function has no observable effect on the execution of the program other than to compute a result given its inputs, then we say that it has no side effects. For example, a function intToString having type Int => String will take every integer to a corresponding string. A pure function is modular and composable as it separates the logic of the computation itself from “what to do with the result” and “how to obtain the input”. Such computation logic is reusable with no side effects.

Referential transparency and purity: An expression e is referentially transparent (RT) if, for all programs p, all occurrences of e in p can be replaced by the result of evaluating e without affecting the meaning of p. A function f is pure if the expression f(x) is referentially transparent for all referentially transparent x. In other words, an expression to be referentially transparent—in any program, when the expression can be replaced by its result without changing the meaning of the program. Referential transparency forces the invariant that everything a function does is represented by the value that it returns, according to the result type of the function. When expressions are referentially transparent the computation proceeds using substitution model were at each step we replace a term with an equivalent one.

Scala Basics
  • The class keyword introduces a class and contains the body within the curly braces. 
  • A method of a class is introduced by the def keyword which is followed by the name of the method, followed by the parameter list in parentheses and return type. The body of the method itself comes after a single equals sign. 
    def functionName ([list of parameters]) : [return type] = { }
  • Methods are implicitly declared abstract if the equals sign and method body is missing from the method declaration.
  • Scala uses the syntax keyword var to declare a variable, while uses the keyword val to declare a constant. The value of the constant declared using val cannot be changed hence called immutable variable.
  • The type of a variable is specified after the variable name with colon in between, and before equals sign (e.g. var sum:Int = 0). Variable may or may not have initial value during declaration.
  • Scala compiler can determine the type of the variable based on the initial value assigned to the variable, which is called as variable type inference.
  • In Scala statements are separated by newlines or by semicolons. Newlines delimit statements in a block.
  • The last statement within the block is automatically returned, without specifying the "return" keyword. The value returned from a method is simply whatever value results from evaluating the right-hand side. Scala compiler can infer the return types of methods based on the last statement but is considered bad style.
  • A pair of values can be returned by the method indicated with type enclosed within braces. A pair can be created by putting the values in parentheses separated by a comma.
    def buyCoffee(cc: CreditCard): (Coffee, Charge) = { .. }
  • A function, which does not return anything (called procedures), can return Unit which is equivalent to void in Java and indicates that function does not return anything.
  • Private methods cannot be called from the code outside the owning object.
  • A multi-line string literal is a sequence of characters enclosed in triple quotes """ ... """. The sequence of characters is arbitrary, except that it may contain three or more consuctive quote characters only at the very end.
  • Scala allows the last parameter to a function to be repeated indicated by '*' following the type, e.g. "String*" which actually is Array[String].
       def main(args: Array[String]) {
            printStrings("Hello", "Scala", "Python");
       }
    
       def printStrings( args:String* ) = {
          var i : Int = 0; 
          for( arg <- args ) {
             println("Arg value[" + i + "] = " + arg );
             i = i + 1;
          }
       }
    
  • Function parameters can have default values. The argument for such a parameter can optionally be omitted from a function call, in which case the corresponding argument will be filled in with the default.
       def main(args: Array[String]) {
            println( "Returned Value : " + addInt() );
       }
    
       def addInt( a:Int=5, b:Int=7 ) : Int = {
          var sum:Int = 0
          sum = a + b
    
          return sum
       }
  • In a normal function call, the arguments in the call are matched one by one in the order of the parameters of the called function. Named arguments allows to pass arguments to a function in a different order were each argument is preceded by a parameter name and an equals sign as below.
       def main(args: Array[String]) {
            printInt(b=5, a=7);
       }
    
       def printInt( a:Int, b:Int ) = {
          println("Value of a : " + a );
          println("Value of b : " + b );
       }
    
  • A class has a primary constructor whose argument list comes after the class name. The parameters in this list become public, unmodifiable (immutable) fields of the class and can be accessed using the usual object-oriented dot notation.
  • A class can be instantiated without the keyword new, by just using the class name followed by the list of arguments for its primary constructor.
  • A List[...] is an immutable singly linked list of values. The List.fill(n)(x) creates a List with n copies of x.
  • The unzip method splits a list of pairs into a pair of lists. E.g. List[(Coffee, Charge)] is split by destructuring the pair to declare two values (coffees and charges) on one line.
  • The reduce method reduces the entire list of values into a single value, using the combine method of the value class to combine values two at a time.
        e.g. val (coffees, charges) = purchases.unzip(coffees, charges.reduce((c1,c2) => c1.combine(c2)))
  • The object keyword creates a new singleton type, which is like a class that only has a single named instance similar to anonymous class in java. An object is scala's equivalent to java's static. E.g. object MyModule { ... }
  • Scala looks at the main method with a specific signature, which takes an Array of Strings as its argument, and its return type is Unit, inorder to execute the program. Unit serves a similar purpose to void in java. The literal syntax for unit is (), a pair of empty parentheses. In Scala, every method has to return some value as long as it doesn’t crash or hang.
  • Scala allows to define functions inside a function which are called local functions and are only visible inside the enclosing method.
  • Anonymous functions in scala also called function literals, can have multiple parameters or no parameter. 
        var mul = (x: Int, y: Int) => x*y
        println(mul(3, 4))
    
        var userDir = () => { System.getProperty("user.dir") }
        println( userDir )
    
  • Scala allows to apply functions partially to avoid passing redundant values when a method is invoked multiple times with the same value for a parameter. The constant parameter value can be eliminated by partially applying the argument to the method by binding a value to the constant parameter and leave the other parameters unbound by putting an underscore at their place. The resulting partially applied function is stored in a variable. 
       def main(args: Array[String]) {
         val date = new Date
         val logWithDateBound = log(date, _ : String)
         logWithDateBound("message1" )
       }
    
       def log(date: Date, message: String)  = {
         println(date + "----" + message)
       }
    
  • Currying transforms a function that takes multiple parameters into a chain of functions, each taking a single parameter. Curried functions are defined with multiple parameter lists. 
       def strcat(s1: String)(s2: String) = s1 + s2
       // Alternate Syntax
       def strcat(s1: String) = (s2: String) => s1 + s2
       strcat("foo")("bar")
    
  • In scala, parameters to the functions are usually passed by value but it also has call-by-name parameters, by putting the => symbol between the variable name and the type, which passes an expression to be evaluated within the called function. A call-by-name mechanism passes a code block to the callee and each time the callee accesses the parameter, the code block is executed and the value is calculated.
  • Every Java class is available in Scala. Scala does not have its own String class and uses java.lang.String. Since String class is immutable, StringBuilder should be used while making frequent String modifications.
Every value in Scala is what’s called an object, and each object may have zero or more members. A member can be a method declared with the def keyword, or it can be another object declared with val or object.
Any method name can be used infix, omitting the dot and parentheses when calling it with a single argument. E.g. instead of MyModule.abs(42) we can say MyModule abs 42 and get the same result.
All of an object’s (nonprivate) members can be brought into scope by using the underscore syntax. E.g. import MyModule._
In scala, functions are values and can be assigned to variables, stored in data structures, and passed as arguments to functions. A function that accepts other functions as arguments. This is called a higher-order function (HOF).
In Scala, functions that are local to the body of another function are called an inner function, or local definition. They are used to write loops functionally, without mutating a loop variable, by using a recursive function. Scala detects self-recursion and compiles it to the same sort of bytecode as would be emitted for a while loop,as long as the recursive call is in tail position. A call is said to be in tail position if the caller does nothing other than return the value of the recursive call. If all recursive calls made by a function are in tail position, Scala automatically compiles the recursion to iterative loops that don’t cona function literalsume call stack frames for each iteration. we can tell the Scala compiler about tail call elimination using the tailrec annotation.
def factorial(n: Int): Int = {
   def go(n: Int, acc: Int): Int =
      if (n <= 0) acc
      else go(n-1, n*acc)
   go(n, 1)
}

Example of passing function as the mthod parameter:
def formatResult(name: String, n: Int, f: Int => Int) = { ... }

Polymorphic Function:
A comma-separated list of type parameters, surrounded by square brackets ([A]), following the name of the function is used to define Polymorphic function for various types. The type parameter list introduces type variables that can be referenced in the rest of the type signature.
def findFirst[A](as: Array[A], p: A => Boolean): Int = {
  @annotation.tailrec
  def loop(n: Int): Int =
     if (n >= as.length) -1
     else if (p(as(n))) n
     else loop(n + 1)
        loop(0)
     }
The expression Array(7, 9, 13) is an array literal and it constructs a new array with three integers in it.
A function literal or anonymous function has the arguments to the function declared to the left of the => arrow, while to the right of the arrow is the body of the function were the parameters can be used. Example syntax: (x: Int) => x == 9 and (x: Int, y: Int) => x == y
A function literal under th hood is an object with a method called apply. Scala allows the objects that have a method with special name "apply" can be called as if they were themselves methods. Hence (a, b) => a < b syntactically looks as below, were calling lessThan(10, 20) actually calls the apply method:
val lessThan = new Function2[Int, Int, Boolean] {
   def apply(a: Int, b: Int) = a < b
}

Scala has Function1, Function2, Function3 and other interfaces known as traits provided by the standard Scala library which takes number of arguments indicated by the name. 
Scala’s standard library provides compose as a method on Function1, were two functions f and g can be composed by calling "f compose g". Also f andThen g is the same as g compose f.

A functional data structure is (not surprisingly) operated on using only pure functions. functional data structures are by definition immutable.

Trait
A trait is an abstract interface that may optionally contain implementations of some methods. When sealed is added in front trait it means that all implementations of the trait must be declared in this file. The implementations of the trait is denoted by 'case' keyword. By specifying the type parameter [+A] after sealed trait within the method signature and using inside the constructor, makes the data type to be polymorphic in the type of elements it contains. Hence it can be used with Int, Double, String elements etc. The + indicates that the type parameter A is covariant, hence for all types X and Y, if X is a subtype of Y, then List[X] is a subtype of List[Y]. Nothing is a subtype of all types, which means that in conjunction with the variance annotation, Nil can be considered a List[Int], a List[Double] etc.

sealed trait List[+A]
case object Nil extends List[Nothing]
case class Cons[+A](head: A, tail: List[A]) extends List[A]

Pattern Matching
Pattern matching works similar to switch statement which may descend into the structure of the expression it examines and extract subexpressions of that structure. It’s introduced with an expression (the target or scrutinee) like ints as below, followed by the keyword match, and a {}-wrapped sequence of cases. Each case in the match consists of a pattern (like Cons(x,xs)) to the left of the => and a result (like x + sum(xs)) to the right of the =>. If the target matches the pattern in a case, the result of that case becomes the result of the entire match expression. If multiple patterns match the target, Scala chooses the first matching case. The variable pattern '_' is generally used to match any expression, e.g. List(1,2,3) match { case Cons(h,_) => h } results in 1 as List(1,2,3) is equals to Cons(1, Cons(2, Cons(3, Nil))). A pattern matches the target if there exists an assignment of variables in the pattern to subexpressions of the target that make it structurally equivalent to the target. The resulting expression for a matching case will then have access to these variable assignments in its local scope.

def sum(ints: List[Int]): Int = ints match {
   case Nil => 0
   case Cons(x,xs) => x + sum(xs)
}

A companion object in addition to the data type and its data constructors is an object with the same name as the data type where we put various convenience functions for creating or working with values of the data type. For example a function to fill the List data type with n copies of element a. When functions are in the companion object they are called as fn(obj, arg1), while when inside the body of the trait they are called as obj.fn(arg1) or obj fn arg1.

Variadic functions are just providing a little syntactic sugar for creating and passing a Seq of elements explicitly. Seq is the interface in Scala’s collections library implemented
by sequence-like data structures such as lists, queues, and vectors. The special _* type annotation allows us to pass a Seq to a variadic method.

When data is immutable, sharing of immutable data allows to implement functions more efficiently without having to worry about subsequent code modifying the data. Hence there’s no need to pessimistically make copies to avoid modification or corruption. Functional data structures are persistent, meaning that existing references are never changed by operations on the data structure. E.g. Below function that adds all the elements of one list to the end of another:

def append[A](a1: List[A], a2: List[A]): List[A] =
  a1 match {
    case Nil => a2
    case Cons(h,t) => Cons(h, append(t, a2))
}

The takeWhile takes the elements from a list while the predicates is true. dropWhile on the other hand drops the elements while the predicate is true and returns the rest.
val xs: List[Int] = List(1,2,3,4,5)

val ex1 = dropWhile(xs, (x: Int) => x < 4)
// ex1 == List(1,2,3)

val ex2 = dropWhile(xs, (x: Int) => x < 4)
// ex2 == List(4,5)

Scala can infer type arguments for higher-order functions when we group function defination into argument groups. The syntax of calling dropWhile is 'dropWhile(xs)(f)' were dropWhile(xs) is returning a function, which then calls with the argument f as below. Hence more generally, when a function definition contains multiple argument groups, type information flows from left to right across these argument groups.

def dropWhile[A](as: List[A])(f: A => Boolean): List[A] =
   as match {
   case Cons(h,t) if f(h) => dropWhile(t)(f)
   case _ => as
}

val xs: List[Int] = List(1,2,3,4,5)
val ex1 = dropWhile(xs)(x => x < 4)

The functions can be generalized by pulling subexpressions out into function arguments. If a subexpression refers to any local variables e.g. + or * operation, turn the subexpression into a function that accepts these variables as arguments. In below example, the function takes an argument the value to return in the case of the empty list, and the function to add an element to the result in the case of a nonempty list. It essentially replaces the constructors of the list, Nil and Cons, with z and f, were Cons(1, Cons(2, Nil)) becomes f (1, f (2, z )).

def foldRight[A,B](as: List[A], z: B)(f: (A, B) => B): B =
  as match {
     case Nil => z
     case Cons(x, xs) => f(x, foldRight(xs, z)(f))
  }

def sum2(ns: List[Int]) = foldRight(ns, 0)((x,y) => x + y)
def product2(ns: List[Double]) = foldRight(ns, 1.0)(_ * _)

The anonymous function (x,y) => x + y can be written as _ + _ in situations where the types of x and y could be inferred by Scala. Each underscore in an anonymous function expression like _ + _ introduces a new (unnamed) function parameter and references it.
       List is a data type in the Scala standard library and the constructor is called ::, which associates to the right, so 1 :: 2 :: Nil is equal to 1 :: (2 :: Nil), which is equal to List(1,2). Below are few methods defined in List of standard library.
  • def take(n: Int): List[A] — Returns a list consisting of the first n elements of this.
  • def takeWhile(f: A => Boolean): List[A] — Returns a list consisting of the longest valid prefix of this whose elements all pass the predicate f.
  • def forall(f: A => Boolean): Boolean — Returns true if and only if all elements of this pass the predicate f.
  • def exists(f: A => Boolean): Boolean — Returns true if any element of this passes the predicate f.
  • scanLeft and scanRight — Like foldLeft and foldRight, but they return the List of partial results rather than just the final accumulated value.
An algebraic data type is a data type defined by one or more data constructors, each of which may contain zero or more arguments. The data type is the sum or union of its data constructors, and each data constructor is the product of its arguments.

Exception Handling
Scala represents failures and exceptions with ordinary values and uses higher-order functions that abstract out common patterns of error handling and recovery. Exceptions thrown in the functions make the method return value not referentially transparent. RT expressions does not depend on context and may be reasoned about locally, whereas the meaning of non-RT expressions is context-dependent and requires more global reasoning. For example the meaning of the RT expression 42 + 5 doesn’t depend on the larger expression it’s embedded in—it’s always equal to 47. On the other hand the meaning of the expression throw new Exception("fail") is very context-dependent as it takes on different meanings depending on which try block (if any) it’s nested within. Exceptions break RT and introduce context dependence, and should be used only for error handling and not for control flow. Exceptions are not type-safe and the compiler does not know nor can enforce handling unknown exceptions which won’t be detected until runtime. Java’s checked exceptions which force to handle exceptions add a significant boilerplate code and, don’t work for higher-order functions which are generic and can’t possibly be aware of the specific exceptions that could be raised by their arguments. Hence instead of throwing an exception, we return a value indicating that an exceptional condition has occurred and introduce a new generic type with the possible defined values.
    The Option data type is an explicit return type when the function may not have a return value. Option has two cases: it can be defined, in which case it will be a Some, or it can be undefined, in which case it will be None. Option is similar to a List were it can contain at most one element, and many of the List functions have analogous functions on Option.

def mean(xs: Seq[Double]): Option[Double] =
   if (xs.isEmpty) None
   else Some(xs.sum / xs.length)

The list of functions on Option include map, flatmap, getOrElse, orElse and filter as below. The map function can be used to transform the result inside an Option, if it exists or else if None it aborts the remaining operation. flatMap function is similar, except that the function provided to transform the result can itself fail. filter function is used to filter out the relevant values and mostly used within the chain of operations. It can also convert successes into failures if the successful values don’t match the given predicate. getOrElse is used for conversion and for error handling. A common idiom is to do o.getOrElse(throw new Exception("FAIL")) to convert the None case of an Option back to an exception. orElse is similar to getOrElse, except that we return another Option if the first is undefined.

trait Option[+A] {
   def map[B](f: A => B): Option[B]
   def flatMap[B](f: A => Option[B]): Option[B]
   def getOrElse[B >: A](default: => B): B
   def orElse[B >: A](ob: => Option[B]): Option[B]
   def filter(f: A => Boolean): Option[A]
}

The default: => B type annotation in above getOrElse function indicates that the argument is of type B, but won’t be evaluated until it’s needed by the function. Also, the B >: A type parameter indicates that B must be equal to or a supertype of A inorder to convince Scala that it’s still safe to declare Option[+A] as covariant in A.
   The advantage of Option is that we don’t have to check for None at each stage of the computation and can apply several transformations before checking/handling None. Also since Option[A] is a different type than A, the compiler warns to handle the possibility of None instead of defering it.

Ordinary functions can be lifted to operate on Option using lift function were the map turns a function f of type A => B into a function of type Option[A] => Option[B]. Hence any function can be transformed (via lift) to operate within the context of a single Option value. The lift(f) returns a function which maps None to None and applies f to the contents of Some. Here f need not be aware of the Option type at all.

def lift[A,B](f: A => B): Option[A] => Option[B] = _ map f

val absO: Option[Double] => Option[Double] = lift(math.abs)

The Try function below is a general-purpose function which can used to convert from an exception-based API to an Option-oriented API. This uses a non-strict or lazy argument, as indicated by the => A as the type of a. Functions accepting a single argument may be called with braces instead of parentheses, hence Try { age.toInt } is equivalent to Try(age.toInt).
def Try[A](a: => A): Option[A] =
  try Some(a)
  catch { case e: Exception => None }

The map2 function means that we never need to modify any existing functions of two arguments to make them “Option-aware.” We can lift them to operate in the context of Option after the fact.
def map2[A,B,C](a: Option[A], b: Option[B])(f: (A, B) => C):
  Option[C] =
    a flatMap ( aa =>
                b map (bb => f(aa, bb))
     )

map2(optAge, optTickes)(insuranceRateQuote)

Sequence combines a list of Options into one Option containing a list of all the Some values in the original list. If the original list contains None even once, the result of the function is None.
def sequence[A](a: List[Option[A]]): Option[List[A]]

Traverse allows to map elements in the list to Option and return all the Option elements as a combined Option list of values.
def traverse[A, B](a: List[A])(f: A => Option[B]): Option[List[B]]

Scala provides a syntactic construct called the for-comprehension that it expands automatically to a series of flatMap and map calls. A for-comprehension consists of a sequence of bindings, like aa <- a, followed by a yield after the closing brace, where the yield may make use of any of the values on the left side of any previous <- binding. The compiler desugars the bindings to flatMap calls, with the final binding and yield being converted to a call to map.
def map2[A,B,C](a: Option[A], b: Option[B])(f: (A, B) => C):
  Option[C] =
    for {
       aa <- a
       bb <- b
    } yield f(aa, bb)

The Either data type which is an extension to Option allows to track the reason for failure. Either has only two cases were each case carries one value. The Right constructor is reserved for the success case and Left is used for failure.

sealed trait Either[+E, +A]
case class Left[+E](value: E) extends Either[E, Nothing]
case class Right[+A](value: A) extends Either[Nothing, A]

def safeDiv(x: Int, y: Int): Either[Exception, Int] =
  try Right(x / y)
  catch { case e: Exception => Left(e) }

Strictness and laziness
Consider refining a list of data using multiple criterias. This can be achieved by chaining the map and filter functions to evaluate the result using multiple transformations. For each transformation the map and filter each perform their own traversal of the input and allocate temperory lists for the output. These transformations can be fused into a single pass thus avoiding creation of temporary data structures using non-strictness.

Non-strictness is a property of a function. A strict function always evaluates its arguments while a non-strict function does not evaluate one or more of its arguments. In Scala any function definition is strict unless specified otherwise. The Boolean functions && and || which may choose not to evaluate their arguments, are the examples of non-strict funtions. The if statement which is a built-in contruct is a function in Scala which takes 3 arguments, the boolean condition and two clauses to be executed when the condition is true or false. Since the execution of the clauses is optional depending upon the boolean value of the evaluated condition, if functon is non-strict. In Scala a non-strict function takes its arguments by name rather than by value.

def if2[A](cond: Boolean, onTrue: () => A, onFalse: () => A): A =
   if (cond) onTrue() else onFalse()
if2(a < 22,
   () => println("a"),
   () => println("b")
)

A value of type () => A is a function that accepts zero arguments and returns an A, which is an syntactic alias for the type Function0[A]. The unevaluated form of an expression is called a thunk. The thunk can be forced to evaluate the expression and get a result by invoking the function, passing an empty argument list, e.g. onTrue() or onFalse() as above. Overall here we pass a function of no arguments in place of each non-strict parameter, and then explicitly call the function to obtain a result in the body. Scala has common syntax for wrapping the expression in a thunk as below, were the unevaluated arguments are passed with an arrow => before their type. The arguments which then were passed unevaluated to a function will be evaluated once for each place it’s referenced in the body of the function without caching the previously evaluated results. The value can be cached explicitly to only evaluate the result once, by using the lazy keyword, preventing repeated evaluations triggered by subsequent references.

def if2[A](cond: Boolean, onTrue: => A, onFalse: => A): A =
  if (cond) onTrue else onFalse

def maybeTwice2(b: Boolean, i: => Int) = {
  lazy val j = i
  if (b) j+j else 0
}

Strictness: If the evaluation of an expression runs forever or throws an error instead of returning a definite value, we say that the expression doesn’t terminate, or that it evaluates to bottom. A function f is strict if the expression f(x) evaluates to bottom for all x that evaluate to bottom.