Emprovise Blog: 2016

Monday, December 19, 2016

Single Sign On with SAML 2.0

Security Assertion Markup Language (SAML) is an XML-based standard for communicating identities and exchanging authentication and authorization data between parties. SAML is a specification which defines messages and their format, message encoding methods, message exchange protocols, and other recommendations. SAML addresses the primary use case of internet single sign-on (SSO), in which the user authenticates once with a single login and is then allowed access to other affiliated systems without additional authentication. SAML thus eliminates the need for multiple authentication credentials in multiple locations and reduces the number of times a user has to authenticate. It separates the security framework from platform architecture and specific vendor implementation. SAML involves three entities: the user, the identity provider and the service provider. The Identity Provider maintains a directory of users and an authentication mechanism to authenticate them. The Service Provider is the target application that a user tries to use.

SAML consists of six components: assertions, protocols, bindings, profiles, metadata, and authentication context. Together these components make it possible to securely transfer identity, authentication, and authorization information between trusted entities.

SAML assertions contain identifying statements made by a SAML authority. In SAML, there are three kinds of assertion statements: authentication, attribute, and authorization decision. An authentication statement asserts that the specified subject was authenticated by a particular means at a particular time and is made by a SAML authority called an identity provider. An attribute statement contains specific information about the specified subject. An authorization decision statement identifies what the specified subject is authorized to do.
SAML protocols define how SAML asks for and receives assertions. The structure and contents of SAML protocol messages are defined by the SAML-defined protocol XML schema.
SAML bindings define how SAML request-response message exchanges are mapped to communication protocols like Simple Object Access Protocol (SOAP). SAML works with multiple protocols including Hypertext Transfer Protocol (HTTP), Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP) and so on.
SAML profiles define constraints and/or extensions to satisfy the specific use case of SAML. For example, the Web SSO Profile details how SAML authentication assertions are exchanged between entities and what the constraints of SAML protocols and binding are. An attribute profile on the other hand establishes specific rules for interpretation of attributes in SAML attribute assertions. For instance, X.500/LDAP profile details how to carry X.500/LDAP attributes within SAML attribute assertions.
SAML metadata defines a way to express and share configuration information between SAML entities. For instance, an entity's supported SAML bindings, operational roles (IDP, SP, etc), identifier information, supporting identity attributes, and key information for encryption and signing can be expressed using SAML metadata XML documents. SAML metadata is defined by its own XML schema.
SAML authentication context carries detailed information regarding the type and strength of authentication that a user employed when they authenticated at an identity provider, which a service provider may need in a number of situations. It is used in (or referred to from) an assertion's authentication statement. A service provider can also include an authentication context in a request to an identity provider to request that the user be authenticated using a specific set of authentication requirements, such as multi-factor authentication.

SAML Assertion
An assertion is a package of information that supplies zero or more statements made by a SAML authority, usually about a subject such as a user. The subject is represented by the <Subject> element. SAML assertions are issued from the Identity Provider (also called the Asserting Party) to the Service Provider (also called the Relying Party). When the user has authenticated with the Identity Provider, a SAML assertion is sent to the Service Provider with the Identity Provider's information about that user. The SAML specification defines three different kinds of assertion statements that can be created by a SAML authority. All SAML-defined statements are associated with a subject. The three kinds of statement defined in the specification are:

Authentication: The subject was authenticated by a particular means at a particular time.
Attribute: The subject is associated with the supplied attributes and values, which are often drawn from a directory such as LDAP.
Authorization Decision: A request to allow the subject to access the specified resource has been granted or denied using the given evidence.

The SAML authentication request protocol enables third-party authentication of a subject.

Name Identifiers
Name Identifiers are identifiers for subjects and the issuers of assertions and protocol messages. They help to establish a means by which parties may be associated with identifiers that are meaningful to each of the parties. They help to limit the scope within which an identifier is used to a small set of system entities. Two or more system entities may use the same name identifier value when referring to different identities. SAML provides name qualifiers to disambiguate a name identifier by effectively placing it in a federated namespace related to the name qualifiers. The <BaseID> element is an extension point that allows applications to add new kinds of identifiers. The NameIDType complex type is used when an element serves to represent an entity by a string-valued name. It is a more restricted form of identifier than the <BaseID> element and is the type underlying both the <NameID> and <Issuer> elements. The <NameID> element is of type NameIDType, and is used in various SAML assertion constructs such as the <Subject> and <SubjectConfirmation> elements, and in various protocol messages. The <EncryptedID> element is of type EncryptedElementType, and carries the content of an unencrypted identifier element in encrypted fashion. The <Issuer> element, with complex type NameIDType, provides information (such as the name) about the issuer of a SAML assertion or protocol message.

Assertions have the following elements:

The <AssertionIDRef> element makes a reference to a SAML assertion by its unique identifier. The specific authority who issued the assertion or from whom the assertion can be obtained is not specified as part of the reference.
The <AssertionURIRef> element makes a reference to a SAML assertion by URI reference.
The <Assertion> element is of the AssertionType complex type and specifies the basic information common to all assertions, such as version, issue time, identifier, issuer, signature, subject and statements.
The SAML assertion MAY be signed adding the <ds:Signature> element, which provides both authentication of the issuer and integrity protection.
The <EncryptedAssertion> element represents an assertion in encrypted fashion.

The Subjects section defines the SAML constructs used to describe the subject of an assertion. The optional <Subject> element specifies the principal that is the subject of all of the (zero or more) statements in the assertion. It identifies the subject using <BaseID>, <NameID>, or <EncryptedID> and confirms it using <SubjectConfirmation> element. A <Subject> element can contain both an identifier and zero or more subject confirmations which a relying party (service provider) can verify when processing an assertion. A <Subject> element SHOULD NOT identify more than one principal.

The <SubjectConfirmation> element provides the means for a relying party (service provider) to verify the correspondence of the subject of the assertion with the party with whom the relying party is communicating. Its Method attribute identifies a protocol or mechanism to be used to confirm the subject.
The <SubjectConfirmationData> element specifies additional data that allows the subject to be confirmed or constrains the circumstances under which the act of subject confirmation can take place. The KeyInfoConfirmationDataType complex type constrains a <SubjectConfirmationData> element to contain one or more <ds:KeyInfo> elements that identify cryptographic keys that are used in some way to authenticate an attesting entity.

The <Conditions> element places constraints on the acceptable use of SAML assertions, such as validity period, audience restriction, one-time use and proxy restrictions.
The <Advice> element contains any additional information that the SAML authority wishes to provide to a relying party (service provider).
The <Statement> element is an extension point that allows other assertion-based applications to reuse the SAML assertion framework.
The <AuthnStatement> element describes a statement by the SAML authority asserting that the assertion subject was authenticated by a particular means at a particular time.
The <SubjectLocality> element specifies the DNS domain name and IP address for the system from which the assertion subject was authenticated.
The <AuthnContext> element specifies the context of an authentication event and can contain an authentication context class reference, an authentication context declaration or declaration reference, or both.
The <AttributeStatement> element describes a statement by the SAML authority asserting that the assertion subject is associated with the specified attributes. Assertions containing <AttributeStatement> elements MUST contain a <Subject> element. It contains <Attribute> elements (each with <AttributeValue> children) and <EncryptedAttribute> elements. The separate <AuthzDecisionStatement> element describes an authorization decision about the subject.
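
To make the element descriptions above more concrete, here is a minimal Python sketch that reads the subject, conditions, authentication statement and attributes out of an assertion using only the standard library. The sample assertion, the entity URLs and the helper names (parse_instant, summarize_assertion) are hypothetical and exist only for illustration. The sketch assumes an unsigned, unencrypted assertion and does not verify any <ds:Signature>, which a real service provider would have to do before trusting the content.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# Hypothetical, unsigned assertion used only for illustration.
SAMPLE_ASSERTION = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                ID="_a1" Version="2.0" IssueInstant="2016-12-19T10:00:00Z">
  <saml:Issuer>https://idp.example.com</saml:Issuer>
  <saml:Subject>
    <saml:NameID>alice@example.com</saml:NameID>
    <saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer"/>
  </saml:Subject>
  <saml:Conditions NotBefore="2016-12-19T09:55:00Z"
                   NotOnOrAfter="2016-12-19T10:05:00Z">
    <saml:AudienceRestriction>
      <saml:Audience>https://sp.example.com</saml:Audience>
    </saml:AudienceRestriction>
  </saml:Conditions>
  <saml:AuthnStatement AuthnInstant="2016-12-19T10:00:00Z">
    <saml:AuthnContext>
      <saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport</saml:AuthnContextClassRef>
    </saml:AuthnContext>
  </saml:AuthnStatement>
  <saml:AttributeStatement>
    <saml:Attribute Name="mail">
      <saml:AttributeValue>alice@example.com</saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""


def parse_instant(value):
    """Parse a UTC timestamp of the form used in the sample (no fractions)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize_assertion(xml_text, audience, now):
    """Extract the main pieces of an assertion and check its <Conditions>."""
    root = ET.fromstring(xml_text)
    cond = root.find("saml:Conditions", NS)
    audiences = [a.text for a in cond.findall("saml:AudienceRestriction/saml:Audience", NS)]
    in_window = parse_instant(cond.get("NotBefore")) <= now < parse_instant(cond.get("NotOnOrAfter"))
    attributes = {
        attr.get("Name"): [v.text for v in attr.findall("saml:AttributeValue", NS)]
        for attr in root.findall("saml:AttributeStatement/saml:Attribute", NS)
    }
    return {
        "issuer": root.findtext("saml:Issuer", namespaces=NS),
        "name_id": root.findtext("saml:Subject/saml:NameID", namespaces=NS),
        "confirmation_method": root.find("saml:Subject/saml:SubjectConfirmation", NS).get("Method"),
        "authn_class": root.findtext(
            "saml:AuthnStatement/saml:AuthnContext/saml:AuthnContextClassRef", namespaces=NS),
        "attributes": attributes,
        "conditions_ok": in_window and audience in audiences,
    }
```

Calling summarize_assertion(SAMPLE_ASSERTION, "https://sp.example.com", now) with a time inside the validity window reports conditions_ok as true; a time outside the window or a different audience makes it false.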

SAML protocol messages can be generated and exchanged using a variety of protocols. The protocols defined by SAML achieve the following actions:

Returning one or more requested assertions. This can occur in response to either a direct request for specific assertions or a query for assertions that meet particular criteria.
Performing authentication on request and returning the corresponding assertion.
Registering a name identifier or terminating a name registration on request.
Retrieving a protocol message that has been requested by means of an artifact.
Performing a near-simultaneous logout of a collection of related sessions (“single logout”) on request.
Providing a name identifier mapping on request.

RequestAbstractType
All SAML requests are of types that are derived from the abstract RequestAbstractType complex type. It has the following attributes and elements:

ID: A unique identifier for the request (required).
Version: The version of this request.
IssueInstant: The time instant of issue of the request encoded in UTC.
Destination: An optional URI reference indicating the address to which this request has been sent in order to prevent malicious forwarding of requests to unintended recipients.
Consent: An optional indicator which indicates that the consent has been obtained from a principal in the sending of this request.
Issuer: An optional <saml:Issuer> that identifies the entity that generated the request message.
Signature: An optional <ds:Signature> that authenticates the requester and provides message integrity.
Extensions: The extension point contains optional protocol message extension elements that are agreed on between the communicating parties.

StatusResponseType
All SAML responses are of types that are derived from the StatusResponseType complex type. It has the following attributes and elements:

ID: A unique identifier for the response (required).
InResponseTo: A reference to the identifier of the request to which the response corresponds.
Version: The version of this response.
IssueInstant: The time instant of issue of the response encoded in UTC.
Destination: An optional URI reference indicating the address to which this response has been sent in order to prevent malicious forwarding of responses to unintended recipients.
Consent: An optional indicator which indicates that the consent has been obtained from a principal in the sending of this response.
Issuer: An optional <saml:Issuer> that identifies the entity that generated the response message.
Signature: An optional <ds:Signature> that authenticates the responder and provides message integrity.
Extensions: The extension point contains optional protocol message extension elements that are agreed on between the communicating parties.
Status: The required <Status> element represents the status of the corresponding request. It contains a <StatusCode> representing the status of the activity, and optional <StatusMessage> and <StatusDetail> elements carrying additional information concerning the status of the request. The status code is "urn:oasis:names:tc:SAML:2.0:status:Success" on success, or one of the failure codes defined in the specification.
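
As a rough illustration of how a requester might use these fields, the sketch below checks the Version, InResponseTo, Destination and <StatusCode> of a response. The sample message, the URLs and the check_response helper are assumptions made up for this example, not part of any SAML library, and signature validation is deliberately left out.

```python
import xml.etree.ElementTree as ET

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"

# Hypothetical response carrying only a status, for illustration.
SAMPLE_RESPONSE = """<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    ID="_r1" InResponseTo="_req1" Version="2.0"
    IssueInstant="2016-12-19T10:00:05Z"
    Destination="https://sp.example.com/acs">
  <samlp:Status>
    <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
  </samlp:Status>
</samlp:Response>"""


def check_response(xml_text, expected_request_id, expected_destination):
    """Return a list of problems found in the StatusResponseType fields."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.get("Version") != "2.0":
        problems.append("unsupported Version")
    if root.get("InResponseTo") != expected_request_id:
        problems.append("InResponseTo does not match the request ID")
    # Destination is optional, so it is only compared when present.
    destination = root.get("Destination")
    if destination is not None and destination != expected_destination:
        problems.append("Destination does not match this endpoint")
    code = root.find(f"{{{SAMLP}}}Status/{{{SAMLP}}}StatusCode")
    if code is None or code.get("Value") != SUCCESS:
        problems.append("status is not Success")
    return problems
```

An empty list means the basic checks passed; each entry otherwise names one failed check.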

Existing assertions can be requested by unique identifier using <AssertionIDRequest>, or queried by subject and statement type using the <SubjectQuery>, <AuthnQuery>, <AttributeQuery> and <AuthzDecisionQuery> elements. An <AuthnQuery> can carry a <RequestedAuthnContext> element to filter the authentication statements returned.

The <Response> message element which is an extension of StatusResponseType is used when a response consists of a list of zero or more assertions that satisfy the request. It has additional <saml:Assertion> or <saml:EncryptedAssertion> which specifies an assertion or encrypted assertion by value. In response to a SAML-defined query message, every assertion returned by a SAML authority must contain a <saml:Subject> element that matches the <saml:Subject> element found in the query. The identifier element (<BaseID>, <NameID>, or <EncryptedID>) or at least one <saml:SubjectConfirmation> element must match between the <saml:Subject> elements of the query and its response.

Authentication Request Protocol
When a principal wants to obtain assertions containing authentication statements to establish a security context at one or more relying parties, it uses the authentication request protocol to send an <AuthnRequest> message element to a SAML authority. The returned <Response> message, which contains one or more assertions, must have at least one assertion which contains at least one authentication statement.
The requester first creates the authentication request. The presenter then sends the <AuthnRequest> message, which describes the properties required of the resulting assertion, to the identity provider, and either authenticates itself to the identity provider or relies on an existing security context to establish its identity. The authentication of the presenter may take place before, during, or after the initial delivery of the <AuthnRequest> message. The <AuthnRequest> message should be signed or otherwise authenticated by the protocol binding used to deliver it. The identity provider provides identifiers for users looking to interact with a system, and issues an assertion containing an authentication statement. The presenter is ideally the attesting entity that satisfies the subject confirmation requirements in the <SubjectConfirmation> elements of the resulting assertion.
The responder replies to an <AuthnRequest> with a <Response> message containing either one or more assertions that meet the specifications defined by the request, or a <Status> describing the error that occurred. The responder may direct the presenter to another identity provider by issuing its own <AuthnRequest> message, so that the resulting assertion can be used to authenticate the presenter to the original responder. The returned assertions contain a <saml:Subject> element which represents the presenter. The identifier type and format are determined by the identity provider.
At least one statement in at least one assertion is a <saml:AuthnStatement> which describes the authentication performed by the responder or an authentication service associated with it. The resulting assertions also contain a <saml:AudienceRestriction> element referencing the requester as an acceptable relying party (service provider). The relying party consumes the assertions to establish a security context and to authenticate or authorize the requested subject in order to provide a service. An identity provider may skip the creation of a new <AuthnRequest> for the authenticating identity provider when authenticating the same presenter for a second requester.

The AuthnRequest element extends RequestAbstractType and adds the following elements and attributes.

<saml:Subject>: The optional subject element specifies the requested subject of the resulting assertion. It may include one or more <saml:SubjectConfirmation> elements to indicate how and/or by whom the resulting assertions can be confirmed. When the subject element is absent, the presenter of the message is presumed to be the requested subject. When no <saml:SubjectConfirmation> elements are included, the presenter is presumed to be the only attesting entity required.
<NameIDPolicy>: It specifies the constraints on the name identifier (e.g. the name identifier format, given as a URI reference) to be used to represent the requested subject. If omitted, any type of identifier supported by the identity provider for the requested subject can be used.
<saml:Conditions>: Specifies the SAML conditions the requester expects to limit the validity and/or use of the resulting assertion(s).
<RequestedAuthnContext>: Specifies the requirements the requester places on the authentication context that applies to the responding provider's authentication of the presenter.
<Scoping>: It specifies a set of identity providers trusted by the requester to authenticate the presenter, as well as limitations and context related to proxying of the <AuthnRequest> message to subsequent identity providers by the responder.
ForceAuthn: When "true" the identity provider must authenticate the presenter directly rather than rely on a previous security context. Default value is false.
IsPassive: When "true" the identity provider and the user agent itself must not visibly take control of the user interface from the requester and interact with the presenter. Default value is false.
AssertionConsumerServiceIndex: Indirectly identifies the location to which the <Response> message should be returned to the requester. It applies only to profiles in which the requester is different from the presenter. When omitted the identity provider returns the <Response> message to the default location associated with the requester for the profile of use. It is mutually exclusive with the AssertionConsumerServiceURL and ProtocolBinding attributes.
AssertionConsumerServiceURL: Specifies by value the location to which the <Response> message should be returned to the requester. The responder must ensure by some means that the value specified is in fact associated with the requester. Metadata is one way to do this; signing the enclosing <AuthnRequest> message is another.
ProtocolBinding: A URI reference that identifies a SAML protocol binding to be used when returning the <Response> message.
AttributeConsumingServiceIndex: Indirectly identifies information associated with the requester describing the SAML attributes the requester desires or requires to be supplied by the identity provider in the <Response> message.
ProviderName: Human-readable name of the requester used by the presenter's user agent or the identity provider.
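
The attributes and elements listed above can be put together as in the following sketch, which builds a bare, unsigned <AuthnRequest> with the Python standard library. The function name build_authn_request, the example URLs, and the choice of the HTTP-POST binding and the emailAddress name identifier format are assumptions for illustration; a real deployment would take these values from its metadata and would usually sign the message.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
ET.register_namespace("samlp", SAMLP)
ET.register_namespace("saml", SAML)


def build_authn_request(issuer, destination, acs_url, force_authn=False):
    """Build an unsigned AuthnRequest and return it as an XML string."""
    request = ET.Element(f"{{{SAMLP}}}AuthnRequest", {
        # xs:ID values must not start with a digit, hence the "_" prefix.
        "ID": "_" + uuid.uuid4().hex,
        "Version": "2.0",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Destination": destination,
        "AssertionConsumerServiceURL": acs_url,
        "ProtocolBinding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST",
        "ForceAuthn": "true" if force_authn else "false",
        "IsPassive": "false",
    })
    ET.SubElement(request, f"{{{SAML}}}Issuer").text = issuer
    ET.SubElement(request, f"{{{SAMLP}}}NameIDPolicy", {
        "Format": "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress",
        "AllowCreate": "true",
    })
    return ET.tostring(request, encoding="unicode")
```

AssertionConsumerServiceURL and ProtocolBinding are used here instead of AssertionConsumerServiceIndex, since the two options are mutually exclusive.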

Artifact Resolution Protocol
The Artifact Resolution Protocol provides the mechanism to transport SAML protocol messages in a SAML binding by reference instead of by value. Both requests and responses can be obtained by reference using the protocol. Instead of binding a message to a transport protocol, the message sender sends a small piece of data called an artifact using the binding. It is mainly used when the binding is unable to carry the message because of size constraints, or to make use of a secure channel without a signature. The <ArtifactResolve> message is used to request that a SAML protocol message be returned in an <ArtifactResponse> message by specifying an artifact that represents the SAML protocol message. The <ArtifactResolve> message is either signed or protected by the protocol binding used to deliver the message. The ArtifactResolve element extends RequestAbstractType and adds the <Artifact> value that the requester received and now wishes to translate into the protocol message it represents. If the responder recognizes the artifact as valid, it responds with the associated protocol message in an <ArtifactResponse> message element; otherwise the response has no embedded message.
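
A minimal sketch of both sides of this exchange is shown below: one helper builds an unsigned <ArtifactResolve> message and another pulls the embedded protocol message, if any, out of an <ArtifactResponse>. The helper names and the sample values are hypothetical, and the messages are assumed to be unsigned and already delivered over some synchronous binding.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
DS = "http://www.w3.org/2000/09/xmldsig#"
ET.register_namespace("samlp", SAMLP)
ET.register_namespace("saml", SAML)


def build_artifact_resolve(issuer, artifact):
    """Build an unsigned ArtifactResolve message for the given artifact."""
    message = ET.Element(f"{{{SAMLP}}}ArtifactResolve", {
        "ID": "_" + uuid.uuid4().hex,
        "Version": "2.0",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    ET.SubElement(message, f"{{{SAML}}}Issuer").text = issuer
    ET.SubElement(message, f"{{{SAMLP}}}Artifact").text = artifact
    return ET.tostring(message, encoding="unicode")


def extract_resolved_message(artifact_response_xml):
    """Return the embedded protocol message element, or None if absent."""
    root = ET.fromstring(artifact_response_xml)
    # Children inherited from StatusResponseType are not the embedded message.
    inherited = {
        f"{{{SAML}}}Issuer", f"{{{DS}}}Signature",
        f"{{{SAMLP}}}Extensions", f"{{{SAMLP}}}Status",
    }
    for child in root:
        if child.tag not in inherited:
            return child
    return None
```

A None result corresponds to the case described above where the responder did not recognize the artifact and returned a response with no embedded message.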

Protocol Bindings
Mappings of SAML request-response message exchanges onto standard messaging or communication protocols are called SAML protocol bindings. A binding is a mapping of SAML messages to a representation that can be transmitted by an HTTP client over the network interface. Bindings that run over HTTP should be protected with Secure Sockets Layer (SSL) or Transport Layer Security (TLS). SAML also offers mechanisms for parties to authenticate to one another, and in addition SAML may use other authentication mechanisms to provide security for SAML itself, especially when the message passes through intermediaries.

RelayState
Some bindings define a "RelayState" mechanism for preserving and conveying state information. The RelayState parameter is used to restore the original application URL so that the user can return to the application once the SAML assertion has been processed. Exposing the application URL in SAML messages can be a security risk. For service provider-initiated SSO, the service provider can save the URL in a cookie and place the name of the cookie in the relay state. For identity provider-initiated SSO this option is not available. Instead, the identity provider places an alias for the application in the relay state, and the service provider maps the alias to the application. In an identity provider-initiated single sign-on scenario, some SAML implementations also use RelayState to identify the specific resource at the service provider.
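
One way to avoid exposing the application URL, sketched here under the assumption of a simple in-memory store on the service provider, is to put only an opaque token in RelayState and keep the real target on the server side. The RelayStateStore class is hypothetical; the article's cookie-based variant works on the same principle.

```python
import secrets


class RelayStateStore:
    """Keeps target URLs server-side and hands out opaque RelayState tokens."""

    def __init__(self):
        self._targets = {}

    def remember(self, target_url):
        # 16 random bytes give a 22-character token, well under the
        # 80-byte RelayState limit imposed by the HTTP bindings.
        token = secrets.token_urlsafe(16)
        self._targets[token] = target_url
        return token

    def resolve(self, token, default="/"):
        # pop() makes each token single-use; unknown tokens fall back
        # to a default landing page instead of an attacker-chosen URL.
        return self._targets.pop(token, default)
```

The service provider would call remember() before redirecting to the identity provider and resolve() when the response comes back with the same RelayState value.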

SAML SOAP Binding

SOAP is a lightweight protocol which uses XML technologies to define an extensible messaging framework providing a message construct that can be exchanged over a variety of underlying protocols. A SOAP message is fundamentally a one-way transmission between SOAP nodes from a SOAP sender to a SOAP receiver, possibly routed through one or more SOAP intermediaries. SOAP defines an XML message envelope that includes header and message body sections, allowing data and control information to be transmitted. SAML request-response protocol elements are enclosed within the SOAP message body. SAML messages can be transported using SOAP without re-encoding from the standard SAML schema to one based on the optional SOAP encoding system. A single SOAP message must not contain more than one SAML request or response element, or any additional XML elements, in the SOAP body. A SAML SOAP request may have arbitrary headers but does not require any headers to process SAML messages. Cache-Control and Pragma headers are used to keep HTTP proxies from caching SAML messages. A SOAP processing error is returned with a <SOAP-ENV:Fault> element, while a SAML error is returned with the <samlp:Status> element within the SOAP body.

Example SOAP Request

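 POST /SamlService HTTP/1.1
 Host: www.example.com
 Content-Type: text/xml
 Content-Length: nnn
 SOAPAction: http://www.oasis-open.org/committees/security
 
 <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <SOAP-ENV:Body>
     <samlp:AttributeQuery ...>
       <ds:Signature> ... </ds:Signature>
       <saml:Subject>
         ...
       </saml:Subject>
     </samlp:AttributeQuery>
   </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>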

Example SOAP Response

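 HTTP/1.1 200 OK
 Content-Type: text/xml
 Content-Length: nnnn
 
 <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <SOAP-ENV:Body>
     <samlp:Response ...>
       <saml:Issuer>https://www.example.com/SAML</saml:Issuer>
       <ds:Signature> ... </ds:Signature>
       <samlp:Status>
         <samlp:StatusCode Value="..."/>
       </samlp:Status>
       <saml:Assertion>
         <saml:Subject>
           ...
         </saml:Subject>
         <saml:AttributeStatement>
           ...
         </saml:AttributeStatement>
       </saml:Assertion>
     </samlp:Response>
   </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>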
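
The envelope handling described for the SOAP binding can be sketched in Python as follows. The helper names wrap_in_soap and unwrap_soap are made up for this example. Note the assumption that the SAML message is unsigned: re-serializing a signed message through ElementTree can change its bytes and namespace prefixes and may invalidate the signature, so a real implementation would embed the original serialized message unchanged.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def wrap_in_soap(saml_xml):
    """Place a single (unsigned) SAML message inside a SOAP 1.1 body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(ET.fromstring(saml_xml))
    return ET.tostring(envelope, encoding="unicode")


def unwrap_soap(soap_xml):
    """Return the single SAML element from a SOAP body, or raise on errors."""
    body = ET.fromstring(soap_xml).find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        raise ValueError("no SOAP body found")
    children = list(body)
    # The binding allows exactly one SAML request or response per body.
    if len(children) != 1:
        raise ValueError("SOAP body must contain exactly one element")
    if children[0].tag == f"{{{SOAP_ENV}}}Fault":
        raise ValueError("SOAP fault returned instead of a SAML message")
    return children[0]
```

A SAML-level error would still arrive as a normal SAML response inside the body, with the failure reported in its <samlp:Status> element rather than as a SOAP fault.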

HTTP Redirect Binding

SAML protocol messages are transmitted within URL parameters using the HTTP Redirect binding. The XML messages are encoded into the URL using specialized URL encodings and transmitted using the HTTP GET method. Larger or more complex messages are instead sent using the HTTP POST or Artifact bindings. Binding endpoints indicate the encodings which they support using the metadata. A URL encoding places the message entirely within the URL query string, and reserves the rest of the URL for the endpoint of the message recipient. A query string parameter named SAMLEncoding is used to identify the encoding mechanism used. When the SAMLEncoding parameter is omitted, the default value is urn:oasis:names:tc:SAML:2.0:bindings:URL-Encoding:DEFLATE, i.e. DEFLATE encoding, which is supported by all endpoints.
Before the DEFLATE compression mechanism is applied to the entire XML content of the original SAML protocol message, any signature on the SAML protocol message, including the <ds:Signature> XML element, is removed. The compressed data is then base64-encoded with linefeeds and whitespace removed. The base64-encoded data is then URL-encoded and added to the URL as the SAMLRequest or SAMLResponse query string parameter, depending on whether the message is a SAML request or response. RelayState may be included with a SAML protocol message transmitted using the HTTP Redirect binding. The RelayState data is URL-encoded and placed in an additional RelayState query string parameter. The value of RelayState must not exceed 80 bytes in length, and its integrity should be protected by the entity that creates it, for example using a checksum with a pseudo-random value. The responder sends the RelayState parameter back in the SAML protocol response. If the original SAML protocol message was signed with an XML digital signature, then the URL-encoded form of the message is signed instead. An additional query string parameter, SigAlg, identifies the signature algorithm used to sign the URL-encoded SAML protocol message. The string to be signed is constructed by concatenating the SAMLRequest (or SAMLResponse), RelayState (if present) and SigAlg query parameters, ordered as SAMLRequest=value&RelayState=value&SigAlg=value. The resulting octet string is fed into the signature algorithm. The signature is then base64-encoded with any whitespace removed, and included as a query string parameter named Signature. The supported signature algorithms are DSAwithSHA1 and RSAwithSHA1, each identified by its URI representation. The order of the query string parameters on the final URL is not prescribed and may vary by implementation, but the string that is signed and verified always uses the fixed order above.
The URL encoding is not canonical and there are multiple encodings for a given value, hence the relying party performs the verification step using the original URL-encoded values it received on the query string. A sample SAML URL with signature is https://idp.com?SAMLResponse=xxxx&SigAlg=xxxx&Signature=xxxx
When the message is signed, the Destination XML attribute in the root SAML element of the message contains the URL to which the sender has instructed the user agent to deliver the message. The recipient verifies that the value matches with the location at which the message has been received.
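
The DEFLATE encoding steps and the construction of the string to be signed can be sketched with the Python standard library as below. The function names are hypothetical, and the actual signing step is omitted because it needs a cryptography library and a private key; the sketch only produces the octet string that would be fed to the signature algorithm.

```python
import base64
import zlib
from urllib.parse import quote


def deflate_encode(xml_text):
    """Raw-DEFLATE the message, then base64-encode it (no line breaks)."""
    # wbits=-15 selects a raw DEFLATE stream without zlib header or checksum.
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    raw = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


def deflate_decode(encoded):
    """Reverse deflate_encode: base64-decode, then inflate the raw stream."""
    return zlib.decompress(base64.b64decode(encoded), -15).decode("utf-8")


def signed_query_input(saml_request, sig_alg, relay_state=None):
    """Build the string to sign in the fixed SAMLRequest, RelayState, SigAlg order.

    saml_request is the value returned by deflate_encode; each value is
    URL-encoded here, and RelayState is left out entirely when absent.
    """
    parts = ["SAMLRequest=" + quote(saml_request, safe="")]
    if relay_state is not None:
        parts.append("RelayState=" + quote(relay_state, safe=""))
    parts.append("SigAlg=" + quote(sig_alg, safe=""))
    return "&".join(parts)
```

For a response, the same construction applies with SAMLResponse in place of SAMLRequest. The verifier should use the URL-encoded values exactly as received rather than re-encoding them, for the reason given above.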

Below are the HTTP request response message exchanges using the HTTP redirect binding.

When the user agent first makes an arbitrary HTTP request to a system entity, the system entity decides to initiate a SAML protocol exchange in order to process the request.
The system entity acting as a SAML requester responds to the HTTP request from the user agent by returning a SAML request. The SAML request is returned encoded into the HTTP response's Location header with HTTP status 303 or 302. The user agent delivers the SAML request by issuing an HTTP GET request to the SAML responder.
The SAML responder may respond to the SAML request by immediately returning a SAML response or might return arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request.
The responder returns a SAML response to the user agent in a similar way as SAML requester responds to the HTTP request. The SAML response is then returned to the SAML requester.
Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

If the signature and assertion are valid, the service provider establishes a session for the user and redirects the browser to the target resource.

HTTP POST Binding
SAML protocol messages can be transmitted within the base64-encoded content of an HTML form control using the HTTP POST binding. It is used when the communicating parties do not share a direct path of communication and the SAML requester and responder need to communicate using an HTTP user agent as an intermediary. It is also used when the responder requires interaction with the user agent to fulfill the request. XML messages with this binding are encoded into an HTML form control and transmitted using the HTTP POST method. A SAML protocol message is form-encoded by applying the base64 encoding rules to the XML representation of the message and placing the result in a hidden form control within a form. Depending on whether the message is a SAML request or a SAML response, the form control is named SAMLRequest or SAMLResponse respectively. RelayState is optional; when present, it is included as a hidden form control named RelayState and has a maximum length of 80 bytes. The action attribute of the form is the recipient's HTTP endpoint to which the SAML message is delivered, and the method attribute is "POST". All the form control values are transformed so that they can be safely included in an HTML document.
Because the message passes through an intermediary user agent, the transport layer cannot be relied upon for end-to-end authentication. Hence SAML allows form-encoded messages to be signed before the base64 encoding is applied. When the message is signed, the Destination XML attribute in the root SAML element contains the URL to which the sender instructed the user agent to deliver the message. The recipient verifies that the value matches the location at which the message has been received. The individual "RelayState" and SAML message values can be integrity protected, but not the combination.
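
The form encoding described above might look like the following sketch. The build_post_form function and its parameters are assumptions for illustration; the message is assumed to be already signed (if signing is used) before it is passed in, and a real page would typically also include a small script that submits the form automatically.

```python
import base64
import html


def build_post_form(action_url, saml_xml, relay_state=None, is_response=False):
    """Return an HTML form carrying a base64-encoded SAML message."""
    field = "SAMLResponse" if is_response else "SAMLRequest"
    encoded = base64.b64encode(saml_xml.encode("utf-8")).decode("ascii")
    fields = [(field, encoded)]
    if relay_state is not None:
        if len(relay_state.encode("utf-8")) > 80:
            raise ValueError("RelayState must not exceed 80 bytes")
        fields.append(("RelayState", relay_state))
    # html.escape makes every value safe to embed in an attribute.
    inputs = "\n".join(
        f'  <input type="hidden" name="{name}" value="{html.escape(value)}"/>'
        for name, value in fields
    )
    return (
        f'<form method="POST" action="{html.escape(action_url)}">\n'
        f"{inputs}\n"
        '  <input type="submit" value="Continue"/>\n'
        "</form>"
    )
```

Unlike the Redirect binding, no DEFLATE step is applied here: the XML is base64-encoded directly.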

Below are the HTTP request response message exchanges using the HTTP POST binding.

When the user agent makes an arbitrary HTTP request to a system entity, the system entity initiates a SAML protocol exchange.
The system entity acting as a SAML requester responds to an HTTP request from the user agent by returning a SAML request. The user agent delivers the SAML request by issuing an HTTP POST request to the SAML responder.
The SAML responder responds to the SAML request by immediately returning a SAML response or returns arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request.
The responder finally returns a SAML response to the user agent to be returned to the SAML requester.
Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

HTTP Artifact Binding
In the HTTP Artifact binding, the SAML request, the SAML response, or both are transmitted by reference using a small stand-in called an artifact. A separate, synchronous binding, such as the SAML SOAP binding, is used to exchange the artifact for the actual protocol message using the artifact resolution protocol defined in the SAML assertions and protocols specification. The artifact binding can be composed with the HTTP Redirect binding to transmit request and response messages in a single protocol exchange using two different bindings. Since the artifact is resolved using another synchronous binding, a direct communication path must exist between the SAML message sender and recipient in the reverse direction of the artifact's transmission. Either the URL parameter encoding or an HTML form control is used for artifact message encoding. RelayState with a value of up to 80 bytes can be included with a SAML artifact transmitted using the artifact binding. The general format of an artifact includes a mandatory two-byte artifact type code (TypeCode) and a two-byte index value identifying a specific endpoint of the artifact resolution service of the issuer (EndpointIndex). Each issuer is assigned an identifying URI, also known as the issuer's entity ID. In the artifact format defined by the specification (type code 0x0004), these four bytes are followed by a 20-byte SourceID, which the artifact receiver uses to determine the artifact issuer's identity and the set of possible resolution endpoints, and a 20-byte MessageHandle that references the message.
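
The artifact layout can be illustrated with a short Python sketch for the type 0x0004 format, in which the SourceID is the SHA-1 hash of the issuer's entity ID and the whole 44-byte value is base64-encoded. The function names and the example entity ID are hypothetical.

```python
import base64
import hashlib
import struct

TYPE_CODE_0004 = 0x0004


def build_artifact(entity_id, message_handle, endpoint_index=0):
    """Assemble a type 0x0004 artifact: TypeCode, EndpointIndex, SourceID, MessageHandle."""
    if len(message_handle) != 20:
        raise ValueError("MessageHandle must be 20 bytes")
    source_id = hashlib.sha1(entity_id.encode("utf-8")).digest()  # 20 bytes
    # ">HH" packs two big-endian unsigned 16-bit integers.
    raw = struct.pack(">HH", TYPE_CODE_0004, endpoint_index) + source_id + message_handle
    return base64.b64encode(raw).decode("ascii")


def parse_artifact(artifact):
    """Split a base64-encoded type 0x0004 artifact into its fields."""
    raw = base64.b64decode(artifact)
    if len(raw) != 44:
        raise ValueError("a type 0x0004 artifact is 44 bytes long")
    type_code, endpoint_index = struct.unpack(">HH", raw[:4])
    if type_code != TYPE_CODE_0004:
        raise ValueError("unsupported artifact type code")
    return {
        "endpoint_index": endpoint_index,
        "source_id": raw[4:24],
        "message_handle": raw[24:44],
    }
```

A receiver would compare source_id against the SHA-1 hashes of the entity IDs it knows in order to find the issuer, then send the artifact to the resolution endpoint selected by endpoint_index.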

Below are the HTTP request response message exchanges using the HTTP Artifact binding.

When the user agent makes an arbitrary HTTP request to a system entity, the system entity decides to initiate a SAML protocol exchange.
The system entity acting as a SAML requester responds to an HTTP request from the user agent by returning an artifact representing a SAML request. If URL-encoded, the artifact is returned encoded into the HTTP response's Location header, and the HTTP status is either 303 or 302. If form-encoded, then the artifact is returned in an XHTML document containing the form and content. The user agent delivers the artifact by issuing either an HTTP GET or POST request to the SAML responder.
The SAML responder determines the SAML requester by examining the artifact and issues a <samlp:ArtifactResolve> request containing the artifact to the SAML requester using a direct SAML binding, thus temporarily reversing roles.
The SAML requester returns a <samlp:ArtifactResponse> containing the original SAML request message it wishes the SAML responder to process.
The SAML responder responds to the SAML request by either immediately returning a SAML artifact or returning arbitrary content to facilitate subsequent interaction with the user agent necessary to fulfill the request.
The responder finally returns a SAML artifact to the user agent to be returned to the SAML requester. The SAML requester determines the SAML responder by examining the artifact, and issues a <samlp:ArtifactResolve> request containing the artifact to the SAML responder using a direct SAML binding.
The SAML responder returns a <samlp:ArtifactResponse> containing the SAML response message it wishes the requester to process.
Upon receiving the SAML response, the SAML requester returns an arbitrary HTTP response to the user agent.

SAML URI Binding

The SAML URI Binding supports the encapsulation of a <samlp:AssertionIDRequest> message with a single <saml:AssertionIDRef> into the resolution of a URI. URI resolution can occur over multiple underlying transports, most commonly HTTP secured with SSL 3.0 or TLS 1.0. A SAML URI reference identifies a specific SAML assertion. The result of resolving the URI is a message containing the assertion, or a transport-specific error. The specific format of the message depends on the underlying transport protocol. If the transport protocol permits the returned content to be described, as HTTP does, then the assertion may be encoded in whatever format that protocol permits. A SAML URI reference should consistently reference the same SAML assertion: when the same URI reference is resolved in the future, either the same SAML assertion or an error is returned.

Thursday, January 14, 2016

Functional Programming in Scala

Functional programming (FP) is based on a simple premise with far-reaching implications: we construct programs using only pure functions, that is, functions with no side effects. A side effect is anything a function does other than returning its result, such as modifying a variable or a field in an object, or writing to an external file. In functional programming, functions are first-class citizens: a function can be created within a function, passed as an argument to another function, or returned from a function.

A function f with input type A and output type B (written in Scala as a single type: A => B, pronounced “A to B” or “A arrow B”) is a computation that relates every value a of type A to exactly one value b of type B such that b is determined solely by the value of a. Any changing state of an internal or external process is irrelevant to computing the result f(a). Hence when a function has no observable effect on the execution of the program other than to compute a result given its inputs, then we say that it has no side effects. For example, a function intToString having type Int => String will take every integer to a corresponding string. A pure function is modular and composable as it separates the logic of the computation itself from “what to do with the result” and “how to obtain the input”. Such computation logic is reusable with no side effects.

Referential transparency and purity: An expression e is referentially transparent (RT) if, for all programs p, all occurrences of e in p can be replaced by the result of evaluating e without affecting the meaning of p. A function f is pure if the expression f(x) is referentially transparent for all referentially transparent x. In other words, an expression is referentially transparent if, in any program, it can be replaced by its result without changing the meaning of the program. Referential transparency enforces the invariant that everything a function does is represented by the value that it returns, according to the result type of the function. When expressions are referentially transparent, computation proceeds by the substitution model, where at each step we replace a term with an equivalent one.
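
As a small illustration of the substitution model, the sketch below contrasts a referentially transparent expression with one that is not. The variable names are chosen only for this example.

```scala
// Referentially transparent: reverse on an immutable String always gives the same result,
// so every occurrence of greeting.reverse could be replaced by its value.
val greeting = "Hello, World"
val r1 = greeting.reverse
val r2 = greeting.reverse
println(r1 == r2)   // true

// Not referentially transparent: append mutates the StringBuilder,
// so the same expression gives a different result the second time.
val sb = new StringBuilder("Hello")
val y1 = sb.append(", World").toString
val y2 = sb.append(", World").toString
println(y1)   // Hello, World
println(y2)   // Hello, World, World
```

Because sb.append(", World") changes sb, replacing the expression by its first result would change the meaning of the program, which is exactly what referential transparency rules out.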

Data Types and Variables

Scala is a pure object-oriented language in the sense that every value is an object, including numbers and functions; there is no separate notion of a primitive type. Each object may have zero or more members. A member is either a method (declared with the def keyword) or another object (declared with val or object).
Scala uses the keyword var to declare a variable and the keyword val to declare a constant. The value of a constant declared using val cannot be changed, hence it is called an immutable variable. The type of a variable is specified after the variable name, separated by a colon, and before the equals sign (e.g. var sum:Int = 0). A local variable must be given an initial value when it is declared. The Scala compiler can determine the type of the variable from the initial value assigned to it, which is called type inference. In Scala, statements are separated by newlines or by semicolons. It should be noted that the ++ operator on numeric variables, e.g. x++, is not allowed in Scala.
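
A minimal sketch of these declaration rules follows; the variable names are illustrative only.

```scala
val language = "Scala"     // the type String is inferred from the initial value
var total: Int = 0         // explicit type annotation after the colon
total = total + 5          // a var can be reassigned
total += 1                 // use += 1 in place of the unsupported total++
// language = "Java"       // would not compile: reassignment to val
println(s"$language total = $total")   // Scala total = 6
```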

Scala Type Hierarchy

There are no primitive types in Scala (unlike Java). All data types in Scala are objects that have methods to operate on their data. All of Scala’s types exist as part of a type hierarchy, and every class that is defined automatically belongs to this hierarchy. Any is the superclass of all classes, also called the top class. It defines certain universal methods such as equals, hashCode, and toString. Any has two direct subclasses, AnyVal and AnyRef.

AnyVal represents predefined value classes corresponding to the primitive types in Java. There are nine predefined value types and they are non-nullable: Double, Float, Long, Int, Short, Byte, Char, Unit, and Boolean. Scala has both numeric (e.g., Int and Double) and non-numeric types (e.g., String) that can be used to define values and variables. Boolean variables can only be true or false. Char literals are written with single quotes.

AnyRef represents reference classes. All non-value types are defined as reference types. User defined classes define reference types by default as they are always (indirectly) subclass of scala.AnyRef (Similar to java.lang.Object in Java).

The empty values in Scala are represented by Null, null, Nil, Nothing, None, and Unit.

Nothing is a subtype of every other type, both value types and reference types, and is also called the bottom type. There is no value that has type Nothing. Nothing is the return type of methods which never return normally, for example because they throw an exception, exit the program, or loop forever. The Scala compiler gives throw expressions the type Nothing: a throw never produces a concrete value, so it can be used wherever a value of any type is expected.

Null is the type (trait) of the null literal. It is a subtype of every type except those of value classes. Hence the reference types can be assigned null but the value types cannot be assigned null. Null is provided mostly for interoperability with other JVM languages.

Unit: Unit in Scala is analogous to void in Java. It is used as the return type of a function that does not return anything meaningful.

Nil: Nil is the List with zero elements. The type of Nil is List[Nothing]; since Nothing has no instances, the list is guaranteed to be empty.

        println(null);
        //println(none) // gives error : not found : value none 
        println(Nil)

None: Scala's Option class has two subclasses, Some and None. Option is used to avoid null pointer exceptions by representing a missing value explicitly instead of assigning null to a reference. None signifies that the method produced no result.

        //printing empty list
        println(None.toList) 
        //checking whether None is empty or not
        println(None.isEmpty)
        //printing value of None as string
        println(None.toString)

Scala Strings

Scala does not have its own String class and uses Java's java.lang.String with its methods for String operations. Every Java class is available in Scala. Since the String class is immutable, StringBuilder should be used when making frequent String modifications.

A multi-line string literal is a sequence of characters enclosed in triple quotes """ ... """. The sequence of characters is arbitrary, except that it may contain three or more consecutive quote characters only at the very end.
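
A short sketch of a triple-quoted literal is shown below. The stripMargin call is an addition not mentioned above: it is a standard library method that removes leading whitespace up to and including the | character on each line, which keeps the source indented without indenting the string.

```scala
// Quotes and line breaks can appear inside the literal without escaping.
val letter =
  """Dear user,
    |  a "quoted" word needs no escaping here.
    |Regards""".stripMargin

println(letter)
```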

String interpolation allows embedding variable references directly in processed string literals. The s interpolator, when prepended to any string literal, allows the usage of variables directly in the string. The s interpolator can also take arbitrary expressions. The f interpolator, when prepended to any string literal, allows the creation of simple formatted strings, similar to printf in other languages. When using the f interpolator, all variable references should be followed by a printf-style format string, like %d. The raw interpolator is similar to the s interpolator except that it performs no escaping of literals within the string.

val name = "mark"
val age = 20
val amount = 345.67

println(name + " is " + age + " years old")  // string concatenation using the + method
println("(%d -- %f -- %s)".format(age, amount, name))
println(s"$name is $age years old")  // s string interpolation
println(f"$name%s is $age%d years old")  // f string interpolation, with type-checked format specifiers
println(raw"Hello \n World")  // raw interpolation, prints \n literally instead of a newline

For Loop

The for loop in Scala iterates over ranges or other collections and does not need a variable declaration, e.g. "var i", in the loop. The format of the for loop is "for(i <- range)", where the expression i <- range is called a generator.

for(i <- 1 to 5) {}

for(i <- 1.to(5)) {}    // for loop using explicit to() function call

for(i <- 1.until(6)) {}    // until is similar but excludes last value in the range

for(i <- 1 to 5; j <- 1 to 3) {}  // multiple nested ranges

for(i <- 1 to 5; if i < 6) {}   // for loop using filter or guard condition

val list = List(3,4,6,8,9,32)
val result = for{ i <- list; if i < 6} yield {   // for loop as expression
  i * i
}

Yield Keyword
The yield keyword makes the for loop an expression that returns a result once all iterations are complete. Conceptually, each iteration's value is collected into a buffer, and when the iterations finish the loop yields the collection built from that buffer. It does not work like an imperative loop. The type of the collection that is returned generally follows the type that is iterated over, hence a Map yields a Map, a List yields a List, and so on; a Range yields an IndexedSeq (a Vector).

val xs = List(1, 2, 3, 4)
val doubled = for (x <- xs) yield x * 2                  // List(2, 4, 6, 8)
val evens = for (i <- 1 to 20 if i % 2 == 0) yield i     // Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
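
To illustrate how the result type follows the collection being iterated, here is a small sketch using a Map and a Range. The names and values are made up for the example.

```scala
// Iterating over a Map and yielding pairs produces a Map again.
val prices = Map("tea" -> 2, "coffee" -> 3)
val doubledPrices = for ((item, price) <- prices) yield (item, price * 2)
println(doubledPrices)   // Map(tea -> 4, coffee -> 6)

// Iterating over a Range produces an indexed sequence (printed as a Vector).
val firstEvens = for (i <- 1 to 6 if i % 2 == 0) yield i
println(firstEvens)      // Vector(2, 4, 6)
```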

Match Expressions

In contrast with "exact matching" in Java's switch statements, pattern matching allows matching a pattern instead of an exact value. The match expressions consist of value to match, match keyword, multiple case clauses with code to execute when the pattern matches and default clause when no other pattern has matched. If the target matches the pattern in a case, the result of that case becomes the result of the entire match expression. If multiple patterns match the target, Scala chooses the first matching case.

// age is assumed to be a String in this snippet
age match {
  case "20" => age
  case "30" | "40" | "50" => age
  case _ => "default"
}

The default clause consists of the underscore character (_) and is the last of the case clauses. The wildcard pattern '_' matches any expression, e.g. List(1,2,3) match { case h :: _ => h } results in 1, as List(1,2,3) is equal to 1 :: 2 :: 3 :: Nil. (With a hand-written list type whose data constructors are named Nil and Cons, the same list is Cons(1, Cons(2, Cons(3, Nil))) and the pattern is written Cons(h, _).) A pattern matches the target if there exists an assignment of variables in the pattern to subexpressions of the target that makes it structurally equivalent to the target. The resulting expression for a matching case will then have access to these variable assignments in its local scope.

def sum(ints: List[Int]): Int = ints match {
   case Nil => 0
   case x :: xs => x + sum(xs)   // x is the head, xs is the rest of the list
}

Pattern guards are boolean expressions which make cases more specific; a guard is written as an if expression after the pattern. A pattern can match not only Ints and Strings but any object type, as shown in the example below (Person is assumed to be a class with a name field).

def getClassAsString(x: Any): String = x match {
    case s: String => s + " is a String"
    case i: Int => "Int"
    case f: Float => "Float"
    case l: List[_] => "List"
    case p: Person if !p.name.isEmpty => "Person with non empty name"
    case _ => "Unknown"
}

Pattern matching is also used to handle the exceptions thrown in try-catch blocks, as below.

def catchBlocksPatternMatching(exception: Exception): String = {
  try {
    throw exception
  } catch {
    case ex: IllegalArgumentException => "It's an IllegalArgumentException"
    case ex: RuntimeException => "It's a RuntimeException"
    case _ => "It's an unknown kind of exception"
  }
}

Classes and Constructors

The class keyword introduces a class, whose body is enclosed in curly braces.
A Scala class has a single primary constructor and any number of auxiliary constructors. The entire body of the class, apart from its method definitions, forms the primary constructor. The parameter list of the primary constructor comes after the class name, and parameters declared with val or var become fields of the class. Such fields are public by default and can be accessed directly; a val field is immutable, while a var field can be reassigned. A getter and a setter are generated automatically for each var field, while a val field gets only a getter. A parameter declared without val or var is visible only inside the class. Fields can be declared public or private. Only the primary constructor can call a superclass constructor.

Auxiliary Constructor: A class can have many auxiliary constructors, but each must have a different signature. An auxiliary constructor is required to call the primary constructor directly or indirectly. In other words, an auxiliary constructor must call either a previously defined auxiliary constructor or the primary constructor as the first statement of its body. Auxiliary constructors are used for constructor overloading and are defined as methods named this.

   class Car(val year: Int, var miles: Int) {  // primary constructor

      println("car created")   // This println is part of primary constructor

      def this(year: Int) = {
         this(year, 0)   // All auxiliary constructors are required to go through primary constructor
      }

      def drive(dist: Int): Unit = {
         miles += dist
      }
   }

The new keyword is used to create an object instance by calling the class's constructors.

   val car = new Car(2010, 0)
   println(car.year)

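In Scala, a getter and setter are implicitly defined for every non-private var in an object. The getter name is the same as the variable name, while the setter name is the variable name followed by "_=". A getter and setter pair can also be written by hand following the same naming convention, as below.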

   class Test {
     private var a = 0
     def age = a
     def age_=(n:Int) = {
        require(n>0)
        a = n
     }
   }

   val t = new Test
   t.age = 5

A method can return a pair of values by declaring its return type as the two types enclosed in parentheses. A pair is created by putting the values in parentheses separated by a comma.

def buyCoffee(cc: CreditCard): (Coffee, Charge) = { .. }
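
Since the body of buyCoffee is elided above, here is a separate, self-contained sketch of returning and unpacking a pair. The minMax function is a hypothetical example, not taken from the original code.

```scala
// Returns the smallest and largest element as a pair.
def minMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)

val (lowest, highest) = minMax(List(3, 9, 4))   // destructure the pair with a pattern
val pair = minMax(List(3, 9, 4))                // or keep it as a single value
println(lowest)    // 3
println(highest)   // 9
println(pair._2)   // 9, tuple fields are accessed with _1 and _2
```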

Scala Functions

A function in Scala is defined using the def keyword, followed by the name, the parameter list in parentheses and the return type. The body of the function comes after a single equals sign. In Scala a parameter to a function is immutable by default. Scala allows functions to be named like operators, e.g. +(), ++(), *() etc. Scala functions can be stored in a variable.

def functionName ([list of parameters]) : [return type] = { }

The value of the last expression within the block is automatically returned, without specifying the "return" keyword. Every method has to return some value as long as it doesn't crash or hang. The value returned from a method is simply whatever value results from evaluating the right-hand side. The Scala compiler can infer the return type of a method from its last expression, but omitting the return type is often considered bad style, particularly for public methods. A function which does not return anything (called a procedure) returns Unit, which is equivalent to void in Java. The literal syntax for unit is (), i.e. a pair of empty parentheses. To begin the execution of a program, Scala looks for a main method with a specific signature, which takes an Array of Strings as its argument and has return type Unit. The App trait can also be used to quickly turn objects into executable programs, as an object inheriting from App also inherits the main method.

Function parameters can have default values. The argument for such a parameter can optionally be omitted from a function call, in which case the corresponding argument will be filled in with the default.

   def main(args: Array[String]): Unit = {
        println( "Value with no parameters : " + addInt() )
        println( "Value with one parameter : " + addInt(5) )
   }

   def addInt( a:Int=5, b:Int=7 ) : Int = {
      val sum: Int = a + b
      sum   // the last expression is the return value; an assignment alone would return Unit
   }

In a normal function call, the arguments in the call are matched one by one in the order of the parameters of the called function. Named arguments allow passing arguments to a function in a different order, where each argument is preceded by a parameter name and an equals sign as below.

   def main(args: Array[String]) {
        printInt(b=5, a=7);
   }

   def printInt( a:Int, b:Int ) = {
      println("Value of a : " + a );
      println("Value of b : " + b );
   }

Any method that takes a single argument can be used infix, omitting the dot and parentheses, e.g. instead of "Math.abs(42)" we can say "Math abs 42" and get the same result. Scala allows the last parameter of a function to be repeated, indicated by '*' following the type, e.g. "String*", which is seen inside the function as a Seq[String].

   def main(args: Array[String]) {
        printStrings("Hello", "Scala", "Python");
   }

   def printStrings( args:String* ) = {
      var i : Int = 0; 
      for( arg <- args ) {
         println("Arg value[" + i + "] = " + arg );
         i = i + 1;
      }
   }

In Scala, curly braces {} can be used in place of parentheses (), especially for enclosing the argument of a method call or the generators of a for loop. Generally, a function accepting a single argument may be called with braces instead of parentheses, hence "Try { age.toInt }" is equivalent to "Try(age.toInt)".

   flatMapExample {32}
   val result = portal.flatMap {(a) => {a.toUpperCase}}

   for {
     n <- 1 to 100
     c <- letters
   } {
     print(n, c)
   }

Scala allows to define functions inside a function which are called local functions and are only visible inside the enclosing method.

Class Methods
A method is a function which is a member of an object (class). Private methods cannot be called from code outside the owning object. All of an object’s non-private members can be brought into scope by using the underscore syntax, e.g. "import MyModule._". Overriding a method during inheritance requires the "override" keyword. A class can be declared abstract in Scala, which prevents instances of it from being created. A method is implicitly abstract if the equals sign and method body are missing from its declaration. Scala also allows abstract fields, declared like abstract methods, which must be defined by concrete subclasses.
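
The sketch below illustrates abstract members and the override keyword. The Shape and Square classes are hypothetical names invented for this example.

```scala
abstract class Shape {
  val name: String                 // abstract field: no initial value
  def area: Double                 // abstract method: no equals sign and no body
  def describe: String = s"$name with area $area"
}

class Square(side: Double) extends Shape {
  val name = "square"              // defines the abstract field
  def area: Double = side * side   // defines the abstract method
  override def describe: String = "A " + super.describe   // override is mandatory here
}

val sq = new Square(2.0)
println(sq.describe)   // A square with area 4.0
// new Shape           // would not compile: an abstract class cannot be instantiated
```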

Apply Method

Scala allows objects that have a method with the special name "apply" to be called as if they were themselves methods.

object Car {
 def apply(year: Int) = new Car(year, 0)
} 
 
val car1 = Car.apply(2013)
val car2 = Car(2013)      // same as above as apply can be dropped

Scala Extractors

A Scala extractor is an object which has a method named unapply. This method takes an object and returns its constituent attributes. It is also used in pattern matching and partial functions. An extractor may also define an apply method, which takes arguments and constructs an object, so it is helpful in constructing values. The unapply method reverses the construction procedure of the apply method.

The return type of the unapply method can be selected like stated below:

If it is a checking procedure then return a Boolean Type.
If the procedure is returning only one sub-value of type T, then return an Option[T].
If the procedure is returning various sub-values of type T1, T2, …, Tn then return an optional tuple i.e, Option[(T1, T2, …, Tn)].
If the procedure returns an unpredictable number of values, then the extractors can be defined with an unapplySeq that returns an Option[Seq[T]].

import scala.util.Random

object CustomerID {

  def apply(name: String) = s"$name--${Random.nextLong}"

  def unapply(customerID: String): Option[String] = {
    val stringArray: Array[String] = customerID.split("--")
    if (stringArray.tail.nonEmpty) Some(stringArray.head) else None
  }
}

val customer1ID = CustomerID("Sukyoung")  // Sukyoung--23098234908
customer1ID match {
  case CustomerID(name) => println(name)  // prints Sukyoung
  case _ => println("Could not extract a CustomerID")
}
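
The Boolean and unapplySeq return types listed above are not covered by the CustomerID example, so here is a minimal sketch of both. The extractor names Even and Words, and the helper functions, are hypothetical.

```scala
// Boolean extractor: a pure check, no value is extracted.
object Even {
  def unapply(n: Int): Boolean = n % 2 == 0
}

// unapplySeq extractor: returns a variable number of sub-values.
object Words {
  def unapplySeq(s: String): Option[Seq[String]] = {
    val trimmed = s.trim
    if (trimmed.isEmpty) None else Some(trimmed.split("\\s+").toList)
  }
}

def parity(n: Int): String = n match {
  case Even() => "even"
  case _      => "odd"
}

def firstWord(s: String): String = s match {
  case Words(first, _*) => first   // _* matches any remaining words
  case _                => ""
}

println(parity(4))                      // even
println(firstWord("hello big world"))   // hello
```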

Singleton and Companion Object

The object keyword creates a new singleton type, which is like a class that has only a single named instance, similar to an anonymous class in Java. An object is Scala's equivalent of Java's static members.

When a class and an object have the same name, the object is called a companion object. The class holds the details of the instance, while the companion object can access the details of the class instance, including its private members. Everything located inside a companion object is not a part of the class’s runtime objects but is available from a static context. The companion object must reside in the same source file as the class.

   object Car {        // Singleton in Scala, where there is only one instance of Car
      def countOfInstances() = {  // similar to static method in Java
      }
   }

   class Foo { }
   object Foo {         // Foo is a Companion object of Class Foo
       def apply() = new Foo
   }

   val foo1 = new Foo  // Creates new instance of Foo by calling actual constructor of Foo class
   val foo2 = Foo()    // Creates instance of Foo by calling apply method with Foo Companion object

A companion object, in addition to the data type and its data constructors, is an object with the same name as the data type, where we put various convenience functions for creating or working with values of the data type. An example is a function that fills a List with n copies of an element a. When functions are in the companion object they are called as fn(obj, arg1), while when inside the body of the trait they are called as obj.fn(arg1) or obj fn arg1.

Value Classes

Value classes allow the Scala compiler to use the underlying value directly and avoid allocating runtime objects. They are comparable to wrapper classes in Java, except that the wrapper is usually erased at compile time. A value class can only extend universal traits and cannot be extended itself. A universal trait is a trait that extends Any, only has defs as members, and does no initialization. A value class must have only a primary constructor with exactly one public val parameter whose type is not a user-defined value class. It should not have specialized type parameters or nested or local classes, traits, or objects. It should not define equals or hashCode methods and should be a top-level class or a member of a statically accessible object. Value classes are immutable. There are nine predefined value types: Double, Float, Long, Int, Short, Byte, Char, Unit, and Boolean. Value classes can be combined with implicit classes for allocation-free extension methods.

class Wrapper(val underlying: Int) extends AnyVal {
  def foo: Wrapper = new Wrapper(underlying * 19)
}

implicit class RichInt(val self: Int) extends AnyVal {
  def toHexString: String = java.lang.Integer.toHexString(self)
}

Call By Name Parameters
In Scala, parameters to functions are passed by value by default. Alternatively, Scala provides call-by-name parameters, which pass an expression to be evaluated within the called function. A call-by-name mechanism passes a code block to the callee (a nullary function which encapsulates the computation of the corresponding parameter), and each time the callee accesses the parameter, the code block is executed and the value is calculated. A call-by-name parameter is declared by prepending the => symbol to the parameter type. Call-by-name parameters are evaluated every time they are used, as opposed to call-by-value parameters, which are evaluated only once. Call-by-name parameters are not evaluated at all if they are not used in the function body. The effect is similar to replacing each use of the by-name parameter with the passed expression.

def callByName(n: => Int): Unit = {    // example of a call-by-name parameter in a method
   println("Method call by name")
   println(n)                          // the passed expression is evaluated here, when n is used
}

val add = (a: Int, b: Int) => {
   println("Add")
   a + b
}

callByName(add(5, 6))  // the expression add(5, 6) is passed unevaluated to the call-by-name parameter

def performOperation1(op: => Unit) {   // another example of call by name parameter
   op
}

def performOperation2(op: () => Unit) {
  op()
}

performOperation1{ println("Done") } 
performOperation2(() => println("Hello!"))

def calculate(input: => Int) = input * 37   // call by name parameter in function

Case Class
A case class is similar to a regular class, except that it is designed for modeling immutable data. It is also useful in pattern matching. All constructor parameters of a case class are vals, which means they are immutable by default. A companion object is created for the case class, with apply and unapply methods added; the apply method handles object construction, hence we can create objects of the case class without using the “new” keyword. The Scala compiler also automatically adds default implementations of the toString, hashCode, equals and copy methods. The copy method is used to create a copy of an instance, optionally with a few attributes modified. By default, case classes and case objects are Serializable.
A case object is an object which is defined with the “case” modifier. It gets the same benefits of avoiding boilerplate code, with generated toString and hashCode methods, and is Serializable. A case class can extend another class, abstract class or trait, but a case class can NOT extend another case class. A case class can override the variables and methods defined in a trait like other classes.

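case class Person(name:String)

// For illustration: the compiler generates a companion object along these lines,
// so it does not need to be written by hand
object Person{
   def unapply(p:Person):Option[String] = Some(p.name)
   def apply(name:String):Person = new Person(name)
}

// A separate example: a two-field variant of Person
case class Person(name:String, age:Int)
val person1 = Person("Posa",30)

val person2 = Person("Posa",30)
val result = (person1==person2)  // true: == compares case class instances field by field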

A deep copy is a copy of an object, together with everything it references, such that no change made through the copy is reflected in the original. A shallow copy, on the other hand, creates a new object whose fields refer to the same objects as the original, so changes to a shared mutable field are visible through both. Java's clone() performs a shallow copy by default, and so does the copy() method that Scala generates for a case class. Since case class fields are normally immutable, this sharing is harmless, and copy() is the usual way to derive a modified instance without changing the original.
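
A small sketch of copy() on a case class follows. The Address and Employee classes and their values are hypothetical and only serve to show that the copy shares unchanged fields with the original.

```scala
case class Address(city: String)
case class Employee(name: String, address: Address)

val original = Employee("Posa", Address("Pune"))
val renamed  = original.copy(name = "Mark")   // new instance with one field changed

println(original.name)                         // Posa, the original is untouched
println(renamed.name)                          // Mark
println(renamed.address eq original.address)   // true, the address object is shared
```

The eq check compares references, which is what makes the copy shallow: only the Employee wrapper is new.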

Case classes are especially useful for pattern matching. In the example below, determineType() takes a parameter of the Animal trait and matches on the type of Animal. It matches the Dog and Cat case classes and the Woodpecker case object, which are different subtypes of the Animal trait. In the first case, Dog(name, _), the field name is used in the return value but the color field is ignored with _. If the Dog class is matched, its name is extracted and used in the string on the right side of the expression. When matching a Cat, we want to ignore the name, so the syntax "_:Cat" is used to match any Cat instance. The anotherExample() method shows the default syntax of matching by class type, i.e. "c:Cat". Because Woodpecker is defined as a case object and has no name, it is matched by its name alone.

trait Animal
case class Dog(name: String, color: String) extends Animal
case class Cat(name: String) extends Animal
case object Woodpecker extends Animal

object CaseClassTest {

    def determineType(x: Animal): String = x match {
        case Dog(name, _) => s"Got a Dog, name = $name"
        case _:Cat => "Got a Cat (ignoring the name)"
        case Woodpecker => "That was a Woodpecker"
        case _ => "That was something else"
    }

    def anotherExample(x: Animal) = x match {
        case d: Dog => println(d.name)
        case c: Cat => println(c.name)
        case _ => println("No name available")   // e.g. Woodpecker has no name field
    }

    println(determineType(Dog("Rocky", "brown")))   // the color value is illustrative
    println(determineType(Cat("Rusty the Cat")))
    println(determineType(Woodpecker))
}

Inner Functions

In Scala, a function that is local to the body of another function is called an inner function, or local definition. Inner functions are often used to write loops functionally, without mutating a loop variable, by using a recursive function.

def factorial(n: Int): Int = {
   def go(n: Int, acc: Int): Int =
      if (n <= 0) acc
      else go(n-1, n*acc)
   go(n, 1)
}

Anonymous Functions

Anonymous functions in Scala, also called function literals, have the arguments to the function declared to the left of the => arrow, while to the right of the arrow is the body of the function, where the parameters can be used. The anonymous function (x,y) => x + y can be written as _ + _ in situations where the types of x and y can be inferred by Scala. Each underscore in an anonymous function expression like _ + _ introduces a new (unnamed) function parameter and references it. Anonymous functions can have either multiple parameters or no parameter at all.

    val multiply = (x: Int, y: Int) => x*y
    println(multiply(3, 4))

    val userDir = () => { System.getProperty("user.dir") }
    println( userDir() )   // the parentheses invoke the function; without them the function object itself is printed

A function literal or anonymous function is, under the hood, an object with an apply method. Hence (a, b) => a < b is equivalent to the definition below, where calling lessThan(10, 20) actually calls the apply method:

val lessThan = new Function2[Int, Int, Boolean] {
   def apply(a: Int, b: Int) = a < b
}

Higher Order Functions

In Scala, functions are values and can be assigned to variables, stored in data structures, and passed as arguments to functions. A function that accepts other functions as arguments or returns a function is called a higher-order function (HOF).

def math(x: Double, y: Double, fn: (Double, Double) => Double) : Double = fn(x, y)
val sum2 = math(50, 20, (x,y)=>x+y)
val min2 = math(50, 20, (x,y)=>x min y)

// HOF with more than two parameters, applied to binary function passed as argument
def math(x: Double, y: Double, z: Double, fn: (Double, Double) => Double) : Double = fn(fn(x, y),z)
val sum3 = math(50, 20, 57, (x,y)=>x + y)

// Using wildcard notation
val sum3Short = math(50, 20, 57, _ + _)
val max3 = math(50, 20, 57, _ max _)
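
The examples above only pass functions in. A minimal sketch of the other direction, a function that returns a function, could look as follows. The names multiplier and twice are made up for this illustration and are not part of any library.

```scala
object HigherOrderExample {
  // Returns a new function that multiplies its argument by `factor`.
  def multiplier(factor: Double): Double => Double = x => x * factor

  // Accepts a function and returns a function that applies it twice.
  def twice(fn: Double => Double): Double => Double = x => fn(fn(x))

  val triple: Double => Double = multiplier(3)
  val timesNine: Double => Double = twice(triple)

  def main(args: Array[String]): Unit = {
    println(triple(2.0))      // 6.0
    println(timesNine(2.0))   // 18.0
  }
}
```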

Partially Applied Functions

Scala allows functions to be applied partially, which avoids passing redundant values when a method is invoked multiple times with the same value for a parameter. The constant argument is bound once, and the other parameters are left unbound by putting an underscore (with a type annotation) in their place. The resulting partially applied function is stored in a variable.

   import java.util.Date

   def main(args: Array[String]): Unit = {
     val date = new Date
     val logWithDateBound = log(date, _ : String)   // date is bound, message is left open
     logWithDateBound("message1" )
   }

   def log(date: Date, message: String): Unit = {
     println(s"$date----$message")
   }

Functions can be fully applied, where all arguments are supplied, or partially applied, where only some of the arguments are supplied. Below are more examples.

val add = (x: Int, y: Int, z: Int) => x + y + z
add(10, 20, 30)  // fully applied function

val f = add(10, 20, _ : Int)  // partially applied function, where two arguments are bound and one is left open
f(30)

val fun = add(10, _ : Int, _ : Int)  // partially applied function, where one argument is bound and two are left open
fun(100, 200)

Closures

A closure is a function whose return value depends on the value of one or more variables that have been declared outside the function (free variables). Changes made to a free variable are visible inside the closure, and changes made inside the closure are visible outside it. A closure is impure when the free variable is declared with var, since its result can change over time; when the free variable is declared with val, it is a pure closure.

// number is called free variable in closure
var number = 10
val add = (x : Int) => x + number

def main(args: Array[String]) {
  println(add(10))
}
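
To make the impure case concrete, here is a small sketch showing that reassigning the free variable changes what the closure returns. The object name and the run helper are only for illustration.

```scala
object ClosureExample {
  var number = 10                      // free variable, declared with var
  val add = (x: Int) => x + number     // closes over `number`

  // Calls the closure before and after reassigning the free variable.
  def run(): (Int, Int) = {
    number = 10
    val before = add(10)   // uses number = 10
    number = 100           // reassigning the free variable...
    val after = add(10)    // ...changes the result of the same call
    (before, after)
  }

  def main(args: Array[String]): Unit =
    println(run())   // (20,110)
}
```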

Currying
Currying transforms a function that takes multiple parameters into a chain of functions, each taking a single parameter. In Scala, curried functions are defined with multiple parameter lists.

   def strcat(s1: String)(s2: String) = s1 + s2
   strcat("foo")("bar")

   // Alternate syntax: a method that returns a function
   def strcat2(s1: String) = (s2: String) => s1 + s2
   strcat2("foo")("bar")

   def add(x: Int, y: Int) = x + y
   def add2(x: Int) = (y: Int) => x + y
   // Scala provides simpler syntax for currying
   def add3(x: Int)(y: Int) = x + y

   def main(args: Array[String]) {
     println(add(20, 10))
     println(add2(20)(10))
     
     val sum10 = add2(10)
     println(sum10(20))
     
   //  val sum30 = add3(30)  // Gives compilation error, as opposed to add2()
     val sum30 = add3(30)_
     println(sum30(300))
   }

As another example, dropWhile is called as dropWhile(xs)(f), where dropWhile(xs) returns a function which is then called with the argument f, as below. Because the type parameter A is fixed by the first argument group, the function literal in the second group needs no type annotation. More generally, when a function definition contains multiple argument groups, type information flows from left to right across these argument groups.

def dropWhile[A](as: List[A])(f: A => Boolean): List[A] =
   as match {
      case h :: t if f(h) => dropWhile(t)(f)   // h :: t matches a non-empty standard library List
      case _ => as
   }

val xs: List[Int] = List(1,2,3,4,5)
val ex1 = dropWhile(xs)(x => x < 4)

Arrays

Array is a special kind of collection in Scala. It is a fixed-size data structure that stores elements of the same data type. The index of the first element of an array is zero, and the index of the last element is the total number of elements minus one. Scala arrays support generics through Array[T], where T is a type parameter or abstract type. Arrays are compatible with Scala sequences: an Array[T] can be passed where a Seq[T] is required. They also support all sequence operations.

// Array declaration syntax: var arrayname = new Array[datatype](size)
val array1 : Array[Int] = new Array[Int](4)
val array2 = new Array[Int](5)
val array3 = Array(1,2,3,4,5,6)
array1(0) = 20

// Print array
for(x <- array1)
 println(x)

import Array._
concat(array1, array2) // concatenate Array

Lists

A list is a collection which represents a linked list and holds a sequenced, linear list of items. In Scala, Lists are immutable and each element in the list is of the same type.
In Scala, List has a cons operator ::, short for "construct", which builds a new List object by adding an element at the beginning of an existing list. The cons operator cannot be used to add a new element at the end of the List, and its right-hand side must be an existing list or Nil. Nil is an object of type List[Nothing] and represents an empty list.

val list1 : List[Int] = List(2,3,4,5,6,7)
val list2 : List[String] = List("One", "Two", "Three")
println(list1(0))    // fetches the element at index 0; internally uses the List.apply() method
list1(0) // get value of list at index 0
// list1(0) = 9  // gives a compilation error as lists are immutable in Scala

val newlist = 0 :: list1  // cons is used to prepend elements to a list
val listA = 1 :: 5 :: 9 :: Nil    // represents List(1,5,9)
val listB = 1 :: 5 :: (9 :: Nil)  // represents List(1,5,9)

A List has various methods such as head, tail, max, min, etc. to perform various operations on the list. The head method returns the first element of the list, the tail method returns all elements except the first, and the last method returns the last element. The reverse method returns the list in reverse order. The foreach method takes a function and applies it to each element of the list. List.fill(n)(x) creates a List with n copies of x. The takeWhile method takes the elements from a list while the specified predicate is true. The dropWhile method, on the other hand, drops the elements from the list while the specified predicate is true and returns the remaining list.

val xs: List[Int] = List(1,2,3,4,5)

val ex1 = xs.takeWhile(x => x < 4)
// ex1 == List(1,2,3)

val ex2 = xs.dropWhile(x => x < 4)
// ex2 == List(4,5)

val ex3 = List.fill(5)(2)  // List of 2s with 5 elements, result being List(2,2,2,2,2)

xs.foreach( println )  // Using foreach() method to print the list
var sum: Int = 0
xs.foreach(sum += _)   // Using foreach() method to calculate the sum of list

Below are a few methods defined on List in the standard library.

def take(n: Int): List[A] — Returns a list consisting of the first n elements of this.
def takeWhile(f: A => Boolean): List[A] — Returns a list consisting of the longest valid prefix of this whose elements all pass the predicate f.
def forall(f: A => Boolean): Boolean — Returns true if and only if all elements of this pass the predicate f.
def exists(f: A => Boolean): Boolean — Returns true if any element of this passes the predicate f.
scanLeft and scanRight — Like foldLeft and foldRight, but they return the List of partial results rather than just the final accumulated value.
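
The methods listed above can be tried out on a small list. The following is only a sketch; the value names are arbitrary.

```scala
object ListMethodsExample {
  val xs = List(1, 2, 3, 4, 5)

  val firstTwo   = xs.take(2)             // List(1, 2)
  val prefix     = xs.takeWhile(_ < 4)    // List(1, 2, 3)
  val rest       = xs.dropWhile(_ < 4)    // List(4, 5)
  val allPos     = xs.forall(_ > 0)       // true
  val anyBig     = xs.exists(_ > 4)       // true
  val runningSum = xs.scanLeft(0)(_ + _)  // List(0, 1, 3, 6, 10, 15)

  def main(args: Array[String]): Unit =
    println(runningSum)
}
```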

The unzip method splits a list of pairs into a pair of lists. E.g. List[(Coffee, Charge)] is split by destructuring the pair to declare two values (coffees and charges) on one line.

The reduce method reduces the entire list of values into a single value, using the combine method of the value class to combine values two at a time.

    val (coffees, charges) = purchases.unzip
    (coffees, charges.reduce((c1,c2) => c1.combine(c2)))
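
A self-contained sketch of unzip followed by reduce is shown below. The Coffee and Charge case classes, their fields, and the sample purchases are assumptions made up for this illustration; only the names come from the text above.

```scala
object UnzipReduceExample {
  // Hypothetical value classes, defined here only for the example.
  case class Coffee(name: String)
  case class Charge(card: String, amount: Double) {
    // Combine two charges on the same card into one.
    def combine(other: Charge): Charge = {
      require(card == other.card, "cannot combine charges on different cards")
      Charge(card, amount + other.amount)
    }
  }

  val purchases: List[(Coffee, Charge)] = List(
    (Coffee("latte"), Charge("card-1", 3.5)),
    (Coffee("espresso"), Charge("card-1", 2.0))
  )

  // unzip splits the list of pairs into a pair of lists.
  val (coffees, charges) = purchases.unzip

  // reduce combines the charges two at a time into a single charge.
  val total: Charge = charges.reduce((c1, c2) => c1.combine(c2))

  def main(args: Array[String]): Unit =
    println(total)   // Charge(card-1,5.5)
}
```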

Sets

A Set is a collection where all the elements are unique, with uniqueness defined by the == method of the element type. If a duplicate item is added to the set, the set quietly discards the request. Sets can be either mutable or immutable. By default, Scala sets are immutable. In order to use a mutable Set, the scala.collection.mutable.Set class should be imported or referenced explicitly. A Set has various methods to add and remove elements, clear the set, query its size, etc.

val set1 : Set[Int] = Set(2,3,4,5,6,7, 7)      // default Immutable set
val set2 : scala.collection.mutable.Set[Int] = scala.collection.mutable.Set(2,3,4,5,6,7, 7)  // Mutable set
val set3 = scala.collection.mutable.Set(2,3,4,5,6,7, 7)
val ispresent = set1(8)    // check if 8 is present in the set

println(set1 + 10)  // returns a new set with 10 added; sets are not ordered and their elements cannot be accessed by index
println(set1 ++ set2)   // combines sets and shows unique values of 2 sets
println(set1 & set2)   // shows common values in 2 sets
println(set1.intersect(set2))   // shows common values in 2 sets
println(set1.min)

Maps

Map is a collection of key-value pairs. Keys are always unique, while values need not be. Keys and values can be of any data type, but the key type and the value type must each be consistent throughout the map. Similar to Sets, Maps in Scala are classified into mutable and immutable maps. By default, Scala uses the immutable Map. In order to use a mutable Map, the scala.collection.mutable.Map class should be imported or referenced explicitly.

val map1 : Map[Int, String] = Map(801 -> "Max", 802 -> "Tom", 804 -> "June")
map1(802) // get value of map for the specified key
map1.keys
map1.values
map1.isEmpty
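map1.contains(801)   // check if the map contains the key

map1.keys.foreach{ key =>
 println(key + " : " + map1(key))
}

val map2 : Map[Int, String] = Map(805 -> "Ann")   // sample second map, assumed here so that the next line compiles
println(map1 ++ map2)   // combines maps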

Tuple

A tuple is a fixed-size collection of elements in Scala. Tuples are heterogeneous data structures, i.e. they can store elements of different data types. A tuple is immutable, unlike an array in Scala, which is mutable. Tuples cannot contain more than 22 elements, i.e. they go up to Tuple22. Scala provides the accessors "_1" to "_22" to fetch the corresponding tuple element. A tuple of two elements can also be created using the arrow syntax, e.g. 1 -> "Tom".

val tupleA = (1, 2, "hello", true)   // tuples are of fixed size and are immutable
val tupleB = new Tuple4(1, 2, "hello", true)  // Tuple4 means the new Tuple contains 4 elements
val tupleC = (1, "hello", (2,3)) 

println(tupleA._4)    // _1, _2, _3, _4 are created for Tuple4 tuple

tupleA.productIterator.foreach{
 i => println(i)
}

Seq Class

Seq is the interface in Scala’s collections library implemented by sequence-like data structures such as lists, queues, and vectors. The special _* type annotation allows us to pass a Seq to a variadic method. Variadic functions just provide a little syntactic sugar for creating and passing a Seq of elements explicitly. The default Seq implementation is immutable.

val x = Seq(1, 1.0, 1F)                // Seq[Double] = List(1.0, 1.0, 1.0)
val y: Seq[Number] = Seq(1, 1.0, 1F)   // Seq[Number] = List(1, 1.0, 1.0)
case class Person(name: String)
val people = Seq(
    Person("Emily"),
    Person("Hannah"),
    Person("Mercedes")
)
(1 to 5).toSeq                   // List(1, 2, 3, 4, 5)
Seq.range(1, 6, 2)               // List(1, 3, 5)
Seq.fill(3)("foo")               // List(foo, foo, foo)
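
The _* annotation mentioned above is not shown in the examples, so here is a minimal sketch. The sum method is a made-up variadic method used only for illustration.

```scala
object VarargsExample {
  // A variadic method: inside the body, `nums` is seen as a Seq[Int].
  def sum(nums: Int*): Int = nums.sum

  val direct = sum(1, 2, 3)      // arguments passed individually
  val values = Seq(4, 5, 6)
  val spread = sum(values: _*)   // an existing Seq expanded with _*

  def main(args: Array[String]): Unit = {
    println(direct)   // 6
    println(spread)   // 15
  }
}
```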

Map and Filter Functions

The map() function is a higher order function available for every collection in Scala. It takes a function as a parameter, and applies that function to every element of the source collection. The map function returns a new collection, usually of the same kind as the source collection, containing the results.

val listA = List(1, 2, 3)
val mapB = Map(1 -> "One", 2 -> "Two", 3 -> "Three")
println(listA.map(x => x * 2))   // double every element in the list
println(listA.map(_ * 2))
println(listA.map{ e => e * 2})

println(listA.map(x => "hi" + x))
println(mapB.mapValues(x => "hi" + x))

println("hello".map(_.toUpper))    // map can also be used on a String; this returns HELLO

The flatten() method collapses a collection of collections into a single collection containing all the elements of the nested collections.
The flatMap() method is similar to the map() method, except that the function passed to flatMap returns a collection for each element, and the resulting inner collections are merged into one sequence. It can be seen as a combination of map and flatten: running map followed by flatten gives the same output as flatMap().

val listOfList = List(List(1,2,3), List(4,5,6))
println(listOfList.flatten)                  // returns list with all elements from nested lists as part of single list

println(List(1, 2, 3).flatMap(x => List(x, x + 1)))   // List(1, 2, 2, 3, 3, 4)
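
The claim that flatMap equals map followed by flatten can be checked directly. The sketch below uses a made-up list of words.

```scala
object FlatMapExample {
  val words = List("ab", "cd")

  // map produces a nested list, one inner list per word.
  val mapped: List[List[Char]] = words.map(_.toList)       // List(List(a, b), List(c, d))

  // flatten removes the inner grouping.
  val mappedThenFlattened: List[Char] = mapped.flatten     // List(a, b, c, d)

  // flatMap does both steps at once.
  val flatMapped: List[Char] = words.flatMap(_.toList)     // List(a, b, c, d)

  def main(args: Array[String]): Unit =
    println(flatMapped == mappedThenFlattened)   // true
}
```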

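The filter method takes a predicate, i.e. a function which returns a boolean value for an element, and returns a new collection containing only the elements for which the predicate is true.

println(List(1, 2, 3, 4).filter(x => x%2 == 0))   // List(2, 4)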

The map, flatMap, and filter functions are also available on the Option type. The map function transforms the result inside an Option if it exists; if the Option is None, the function is not applied and the result stays None, which short-circuits the remaining operations. The flatMap function is similar to map, except that the function provided to transform the result can itself fail, i.e. it returns an Option. The filter function keeps the value only if it satisfies a predicate and is mostly used within a chain of operations.
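
As a rough sketch of how these three functions behave on Option, consider the following. The ages map and the parseInt helper are hypothetical and exist only for this illustration.

```scala
object OptionChainExample {
  // Hypothetical lookup table used only for this illustration.
  val ages: Map[String, Int] = Map("Tom" -> 30, "June" -> 17)

  // Parsing can fail, so the helper returns an Option itself.
  def parseInt(s: String): Option[Int] =
    try Some(s.trim.toInt) catch { case _: NumberFormatException => None }

  val nextYear: Option[Int] = ages.get("Tom").map(_ + 1)         // Some(31)
  val missing: Option[Int]  = ages.get("Max").map(_ + 1)         // None, map is skipped
  val parsed: Option[Int]   = Some("42").flatMap(parseInt)       // Some(42)
  val failed: Option[Int]   = Some("x").flatMap(parseInt)        // None, the inner step failed
  val adult: Option[Int]    = ages.get("June").filter(_ >= 18)   // None, predicate not satisfied

  def main(args: Array[String]): Unit =
    println(List(nextYear, missing, parsed, failed, adult))
}
```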

Option Class

Exceptions thrown inside functions break referential transparency and introduce context dependence, so they should be used only for error handling and not for control flow. Exceptions are also not type-safe: the compiler neither knows about nor can enforce the handling of exceptions, so unhandled ones are not detected until runtime. Hence, instead of throwing an exception, Scala provides the Option data type as an explicit return type for functions that may not have a return value. Option has two cases: it can be defined, in which case it will be a Some, or it can be undefined, in which case it will be None.

The Option[T] class serves as a container for zero or one element of a given type T. When a method might otherwise return null, it can instead return an instance of Option. The returned instance is either an instance of Some[T], if the element exists, or the None object, if it does not; both are subclasses of Option. Popular ways to unwrap an optional value are pattern matching using case, the getOrElse() method, and foreach, which works because an Option[T] behaves like a collection of zero or one elements of type T. The isDefined method returns true if the Option is not None and false otherwise. The getOrElse method returns the value of the Option, or a supplied default value if the Option is None. A common idiom is to do getOrElse(throw new Exception("FAIL")) to convert the None case of an Option back to an exception. The orElse() method is similar to getOrElse(), except that it returns another Option if the first is undefined.

        // searching a list returns an Option
        val sampleList = List(1, 2, 3)
        sampleList.find(_ > 3)   // This returns None
        sampleList.find(_ > 2)   // This returns Some(3)

        // To extract the value from a Some instance, the get() method is used on Option
        val result1 = sampleList.find(_ > 2).get  // get returns 3 for Some(3); for a None result it throws an exception
        val result2 = sampleList.find(_ > 3).getOrElse(0)  // getOrElse returns the default value 0 when the result is None instead of throwing an exception

        val option1 : Option[Int] = Some(5)
        println(option1.isEmpty)    // Option allows checking whether it has a value using isEmpty; prints false
        println(option1.get)        // prints 5

The Either data type is an extension of Option which allows tracking the reason for a failure. Either has only two cases, each of which carries one value. By convention, the Right constructor is reserved for the success case and Left is used for failure.

def safeDiv(x: Int, y: Int): Either[Exception, Int] =
  try Right(x / y)
  catch { case e: Exception => Left(e) }
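
A caller would typically pattern match on the result. The sketch below repeats safeDiv so that it is self-contained; the describe helper is a made-up name.

```scala
object EitherExample {
  def safeDiv(x: Int, y: Int): Either[Exception, Int] =
    try Right(x / y)
    catch { case e: Exception => Left(e) }

  // Turns either outcome into a message; Right is success, Left is failure.
  def describe(result: Either[Exception, Int]): String = result match {
    case Right(value) => s"ok: $value"
    case Left(error)  => s"failed: ${error.getMessage}"
  }

  val good = describe(safeDiv(10, 2))   // "ok: 5"
  val bad  = describe(safeDiv(1, 0))    // starts with "failed: ", followed by the exception message

  def main(args: Array[String]): Unit = {
    println(good)
    println(bad)
  }
}
```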

Reduce / Fold / Scan

The fold, reduce and scan functions are a family of higher-order functions which use a given combining operation to recombine the results of recursively processing a collection's constituent parts, building up a return value. They apply a binary operator to the elements of a collection, and the result of each step is passed on to the next step as one of the binary operator's two arguments. The xLeft function variation is used to go forward through the collection, while the xRight function variation is used to go backwards through the collection.

Reduce (reduceLeft/reduceRight) takes a binary operator function as a parameter and applies it across the elements of the collection to return a single cumulative result. The plain reduce method expects the operator to be associative.

val list1 = List(1, 2, 4, 6, 7, 9, 13, 16, 20)
val list2 = List("A", "B", "C", "F", "G")

println(list1.reduceLeft(_ + _))    // 78
println(list2.reduceLeft(_ + _))    // ABCFG

Fold (foldLeft/foldRight) functions are similar to reduce, but take an initial value as an additional argument.

println(list1.foldLeft(100)(_ + _))    // 178 which is 100 + 78 which is total of all elements of the list
println(list2.foldLeft("Z")(_ + _))    // ZABCFG

val result = list1.foldLeft(0){(c,e) => c + e}

Scan (scanLeft/scanRight) functions are similar to fold functions, except that they return a collection of all intermediate results, starting with the initial value, rather than just the final one.

println(list1.scanLeft(100)(_ + _))    // 100, 101, 103, 107, 113, 120, 129, 142, 158, 178
println(list2.scanLeft("Z")(_ + _))    // Z, ZA, ZAB, ZABC, ZABCF, ZABCFG
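
The difference between the Left and Right variations shows up as soon as the operator is not commutative or not associative. A small sketch with made-up values:

```scala
object FoldDirectionExample {
  val letters = List("A", "B", "C")

  // foldLeft: the accumulator is the first argument, elements are visited left to right.
  val left = letters.foldLeft("Z")((acc, e) => acc + e)     // "ZABC"

  // foldRight: the accumulator is the second argument, elements are visited right to left.
  val right = letters.foldRight("Z")((e, acc) => e + acc)   // "ABCZ"

  // With subtraction, the two directions give different numbers.
  val subLeft  = List(1, 2, 3).foldLeft(0)(_ - _)    // ((0 - 1) - 2) - 3 = -6
  val subRight = List(1, 2, 3).foldRight(0)(_ - _)   // 1 - (2 - (3 - 0)) = 2

  def main(args: Array[String]): Unit =
    println((left, right, subLeft, subRight))
}
```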

Strictness and laziness

Scala provides two strategies for the evaluation of expressions: strict and lazy. Lazy evaluation delays the evaluation of an expression until its value is needed. Strict evaluation, however, evaluates the expression or function arguments immediately. Scala evaluates expressions strictly by default, but allows lazy evaluation by explicitly using the lazy keyword.

   def square(i: Int): Int = i*i

   lazy val l = square(15)/square(11)
   println(l)
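
To observe that a lazy val is evaluated on first use and then cached, one can count how often its initializer runs. This is only a sketch; the counter and the run helper are made up for the illustration.

```scala
object LazyExample {
  var evaluations = 0   // counts how often the initializer runs

  lazy val expensive: Int = {
    evaluations += 1
    15 * 15
  }

  // Returns the counter before first use, the sum of two reads, and the counter afterwards.
  def run(): (Int, Int, Int) = {
    val before = evaluations   // nothing has been computed on the first call
    val first  = expensive     // first access forces the computation
    val second = expensive     // second access reuses the cached value
    (before, first + second, evaluations)
  }

  def main(args: Array[String]): Unit =
    println(run())   // (0,450,1)
}
```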

Traits
Scala does not support multiple inheritance of classes and provides traits instead. A trait is an abstract interface that may optionally contain implementations of some methods. Traits may contain any mix of abstract and non-abstract methods. When sealed is added in front of a trait, all implementations of the trait must be declared in the same file. Traits that are declared with no methods, functions, types or properties are called marker traits, e.g. scala.Immutable is a marker trait which indicates the semantics of immutability. A trait can be added at the class level as well as at the instance level, as shown in the below example.

trait Friend {
 val name: String
 def listen() = println("I am " + name + " listening")
}
class Animal(val name: String)
class Dog(override val name: String) extends Animal(name) with Friend
class Cat(override val name: String) extends Animal(name)

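def seekHelp(friend: Friend) = friend.listen()   // assumed helper: accepts any object that mixes in Friend

def main(args: Array[String]) {
  val mycat = new Cat("mycat") with Friend   // trait added at the instance level
  mycat.listen()
  seekHelp(mycat)
}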

Traits can be used to implement the Decorator pattern, selectively layering functions without using multiple inheritance, as below.

abstract class Writer {
   def write(msg: String): Unit
}

class StringWriter extends Writer {
   val target = new StringBuilder   // buffer that accumulates the written text

   def write(msg: String): Unit = target.append(msg)
   override def toString() = target.toString()
}

trait UpperCaseFilter extends Writer {
   abstract override def write(msg: String): Unit = {
      super.write(msg.toUpperCase())    // Modify the input string and pass it to the next available class in trait hierarchy
                                        // i.e. StringWriter.write() method
   }
}

def write(writer: Writer) = {
   writer.write("This is Great")
}

def main(args: Array[String]) {
  write(new StringWriter)
  write(new StringWriter with UpperCaseFilter)
}

Traits and classes can be marked sealed which means all subtypes must be declared in the same file. This assures that all subtypes are known.
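
A typical use of sealed is exhaustive pattern matching. The shape hierarchy below is a made-up example, not taken from the text.

```scala
object SealedExample {
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rectangle(width: Double, height: Double) extends Shape

  // Because Shape is sealed, the compiler knows all subtypes and can warn if a case is missing.
  def area(shape: Shape): Double = shape match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }

  def main(args: Array[String]): Unit =
    println(area(Rectangle(2, 3)))   // 6.0
}
```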

Implicits

Scala provides implicit parameters and conversions, which make it possible to adapt or extend existing libraries. Implicits let you omit calling methods or referencing variables directly and instead rely on the compiler to make the connections. The compiler will call an implicit method or reference an implicit variable if the code doesn’t compile as written but would compile if the implicit were used. Implicit definitions are those that the compiler is allowed to insert into a program in order to fix any of its type errors. An implicit conversion is only inserted if it is in scope and there is no other applicable conversion to insert. The Scala compiler will only use one implicit conversion at a time and will not change the code if it already works. There are three types of implicit definitions:

Implicit parameters (aka implicit values) will be automatically passed values that have been marked as implicit. It means that if no value is supplied when the function is called, the compiler will look for an implicit value of the right type and pass it in for you. The implicit value that the compiler passes in can be declared as a val, a var or even a def.

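val value = 10   // sample value, assumed here so that the snippet compiles
def multiply(implicit by: Int) = value * by

implicit val multiplier: Int = 2

multiply   // the compiler passes multiplier implicitly, giving 20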

Implicit can be used only once in a parameter list and all parameters following it will be implicit.

def example1(implicit x: Int)                       // x is implicit
def example2(implicit x: Int, y: Int)               // x and y are implicit
def example3(x: Int, implicit y: Int)               // won't compile
def example4(x: Int)(implicit y: Int)               // only y is implicit
def example5(implicit x: Int)(y: Int)               // won't compile
def example6(implicit x: Int)(implicit y: Int)      // won't compile

Implicit functions are defs that will be called automatically if the code wouldn’t otherwise compile. They’re typically used to create implicit conversion functions: single-argument functions that automatically convert from one type to another. References to implicit functions get applied to implicit arguments in the same way as references to implicit methods. To avoid implicit ambiguity, nested occurrences of an implicit take precedence over outer ones.
```
implicit def intToStr(num: Int): String = s"The value is $num"

42.toUpperCase() // evaluates to "THE VALUE IS 42"

def functionTakingString(str: String) = str

// note that we're passing int
functionTakingString(42) // evaluates to "The value is 42"
```

Implicit classes extend behavior of existing classes you don’t otherwise control.

implicit class StringImprovements(s: String) {
 def increment = s.map(c => (c + 1).toChar)
}
  
val result = "HAL".increment

Tail Recursion

Scala detects self-recursion and compiles it to the same sort of bytecode as would be emitted for a while loop, as long as the recursive call is in tail position. A call is said to be in tail position if the caller does nothing other than return the value of the recursive call. If all recursive calls made by a function are in tail position, Scala automatically compiles the recursion to iterative loops that don’t consume call stack frames for each iteration. We can ask the Scala compiler to verify tail call elimination using the tailrec annotation; compilation fails if the annotated function is not tail-recursive.

def findFirst[A](as: Array[A], p: A => Boolean): Int = {
  @annotation.tailrec
  def loop(n: Int): Int =
    if (n >= as.length) -1
    else if (p(as(n))) n
    else loop(n + 1)

  loop(0)
}

Variance

Variance defines Inheritance relationships of Parameterized Types. Type parameters in Scala are written in square brackets, e.g. [A]. For List[T], the typed lists List[Int], List[AnyVal], etc. are known as "Parameterized Types" while T is called Type Parameter. Variance makes Scala collections more Type-Safe. Scala supports three types of variance, namely Covariant, Invariant and Contravariant.

Covariant: If "S" is subtype of "T" then List[S] is a subtype of List[T]. To represent a covariance relationship between two parameterized types, Scala prefixes the type parameter with the "+" symbol. For example, List[+T], Option[+T] and Vector[+T], where T is a type parameter and the "+" symbol defines Scala covariance.

Contravariant: If "S" is subtype of "T" then, for a contravariant type written here as List[-T], List[T] is a subtype of List[S]. To represent a contravariant relationship between two parameterized types, Scala prefixes the type parameter with the "-" symbol. The standard library List itself is covariant; the parameter type of Function1[-T1, +R] is a real example of a contravariant type parameter.

Invariant: If "S" is subtype of "T" then List[S] and List[T] don’t have an inheritance or sub-typing relationship. Such a relationship between two parameterized types is known as "Invariant or Non-Variant". In Scala, generic types are invariant by default, i.e. when the type parameter is defined without the "+" or "-" symbol.

Scala Variance Type	Syntax	Description
Covariant	[+T]	If S is subtype of T, then List[S] is also subtype of List[T]
Contravariant	[-T]	If S is subtype of T, then List[T] is also subtype of List[S]
Invariant	[T]	If S is subtype of T, then List[S] and List[T] are unrelated
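
The three kinds of variance can be sketched with small user-defined types. Box and Printer below are hypothetical names chosen for this illustration.

```scala
object VarianceExample {
  class Animal { def name: String = "animal" }
  class Dog extends Animal { override def name: String = "dog" }

  // Covariant: a Box[Dog] can be used where a Box[Animal] is expected.
  class Box[+T](val value: T)
  val animalBox: Box[Animal] = new Box[Dog](new Dog)

  // Contravariant: a Printer[Animal] can be used where a Printer[Dog] is expected.
  trait Printer[-T] { def show(t: T): String }
  val animalPrinter: Printer[Animal] = new Printer[Animal] {
    def show(a: Animal): String = "printing " + a.name
  }
  val dogPrinter: Printer[Dog] = animalPrinter

  // Invariant: Array[Dog] is not an Array[Animal]; the next line would not compile.
  // val animals: Array[Animal] = Array[Dog](new Dog)

  def main(args: Array[String]): Unit =
    println(dogPrinter.show(new Dog))   // printing dog
}
```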

Type Bounds

Type Bounds are restrictions on Type Parameters (taken by generic classes) or Abstract Type members (taken by traits or abstract classes). By using Type Bounds limits can be defined to a Type Variable. Scala supports Upper Bounds, Lower Bounds and View Bounds for Type Variables.

Upper Bounds: The syntax for Upper Bound in Scala is [T <: S]. Here T is a Type Parameter and S is a type. It indicates that the Type Parameter T must be either same as S or Sub-Type of S.

  class Animal 
  class Dog extends Animal 
  class PitBull extends Dog 

  object ScalaUpperBounds {

    def display [T <: Dog](d : T) { 
       println(d) 
    } 

    def main(args: Array[String]) {
       display(new PitBull) 
       display(new Dog) 
    }
  }

In the above example, an upper bound of Dog is defined for the type parameter T. Hence T here can be either Dog or a subtype of Dog.

Lower Bounds: The syntax for Lower Bound in Scala is [T >: S]. This indicates that the Type Parameter T must be either the same as Type S or a Super-Type of S.

  class Animal 
  class Dog extends Animal 
  class PitBull extends Dog
  class Labrador extends Dog

  object ScalaLowerBounds {

    def display [T >: PitBull](d : T) { 
       println(d) 
    } 

    def main(args: Array[String]) {
       display(new PitBull) 
       display(new Dog)
       display(new Animal)
    }
  }

In the above example, a lower bound of PitBull is defined for the type parameter T. Hence T must be either PitBull or a supertype of PitBull.

View Bounds: A view bound allows existing implicit conversions to be used automatically. The syntax for View Bound in Scala is [T <% S]. A view bound enables the use of some type A as if it were some type B. In the below example, an implicit conversion from A to B must be available, so that one can call B methods on an object of type A. View bounds are deprecated.

def f[A <% B](a: A) = a.bMethod
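
Since view bounds are deprecated, the same effect is usually written with an ordinary implicit parameter that carries the conversion. The following is a sketch of that approach; Meters, doubledLength and intToMeters are hypothetical names invented for the example.

```scala
object ViewBoundReplacementExample {
  // Hypothetical target type with the method we want to call.
  class Meters(val value: Double) {
    def doubled: Meters = new Meters(value * 2)
  }

  // The conversion is passed as an implicit parameter instead of writing [A <% Meters].
  def doubledLength[A](a: A)(implicit toMeters: A => Meters): Meters =
    toMeters(a).doubled

  // An implicit conversion from Int to Meters that the compiler can supply.
  implicit val intToMeters: Int => Meters = i => new Meters(i.toDouble)

  val result: Meters = doubledLength(21)

  def main(args: Array[String]): Unit =
    println(result.value)   // 42.0
}
```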

Underscore Special Character

The underscore is a special character in Scala. When a method name ends in _=, such as age_=, the underscore effectively allows for a space in the method name, which essentially makes the name “age =”. This allows the setter method to be used in the same way as directly assigning to a public property. In Scala, parentheses are usually optional, so a call to the setter can be written in any of the following ways:

person.age =(99)
// Or
person.age_=(99)
// Or
person.age = 99
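
For completeness, a class supporting the calls above could be declared roughly as follows. The Person class here is an assumption, since the text does not show its definition.

```scala
object SetterExample {
  class Person(private var _age: Int) {
    def age: Int = _age                // getter
    def age_=(value: Int): Unit = {    // setter, invoked by `person.age = ...`
      require(value >= 0, "age must not be negative")
      _age = value
    }
  }

  val person = new Person(42)
  person.age = 99   // rewritten by the compiler to person.age_=(99)

  def main(args: Array[String]): Unit =
    println(person.age)   // 99
}
```

Defining the setter as a method rather than exposing a public var leaves room for validation, as the require call hints.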

Scala is a functional language, so a function can be treated as a normal value. However, if you try to assign a method that takes no parameters to a new variable, the method will be invoked and its result will be assigned to the variable. This confusion occurs because the parentheses are optional for method invocation. We should use _ after the method name to assign the method itself, as a function value, to another variable.

class Test {
  def fun = {
    // some code
  }
  val funLike = fun _
}

Standard Library Functions
Scala has Function1, Function2, Function3 and other traits provided by the standard Scala library, each taking the number of arguments indicated by its name. Scala’s standard library provides compose as a method on Function1, where two functions f and g can be composed by calling "f compose g". Also, f andThen g is the same as g compose f. A functional data structure is (not surprisingly) operated on using only pure functions. Functional data structures are by definition immutable.
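
The order in which compose and andThen apply their functions is easy to mix up, so here is a short sketch with two made-up functions.

```scala
object ComposeExample {
  val addOne: Int => Int = _ + 1
  val double: Int => Int = _ * 2

  val composed = addOne compose double   // x => addOne(double(x))
  val chained  = addOne andThen double   // x => double(addOne(x))

  def main(args: Array[String]): Unit = {
    println(composed(5))   // 11
    println(chained(5))    // 12
  }
}
```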

Scala Streams
A Stream is a lazy list whose elements are evaluated only when they are needed. Streams have the same performance characteristics as lists. Similar to the list's cons operator ::, Stream has the cons operator #::. Stream.empty is used at the end of the expression to terminate it with an empty stream.

val stream1: Stream[Int] = 1 #:: 2 #:: 3 #:: Stream.empty  // using #:: operator
val stream2: Stream[Int] = Stream.cons(1, Stream.cons(2, Stream.cons(3, Stream.empty) ) )       // using Stream.cons method

val stream3: Stream[Int] = Stream.from(1)   // create infinite stream

val emptyStream: Stream[Int] = Stream.empty[Int]  // initialize empty stream

println(s"Elements of stream1 = $stream1")         // Stream(1, ?)
stream2.take(3).print       // prints, 1, 2, 3, empty
stream2.take(10).print       // prints, 1, 2, 3, empty

Only the first element of the stream is shown when the stream is printed using println. The stream's take() method evaluates only the specified number of leading elements from the stream, which can then be used to perform operations.

Futures and Promises

A Future is an object holding a value which may become available at some point. In other words, it is a placeholder object for a value that may not yet exist. The value is usually the result of some other computation, which determines the state of the future. Depending on the success or failure of the computation, the future is either completed with a value or completed with an exception thrown by the computation. Once a Future object (Future[T]) is given a value or an exception, it becomes immutable and cannot be overwritten. The Future.apply method starts (or schedules) an asynchronous computation and returns a future object holding the result of that computation. The result becomes available once the future completes.

import scala.concurrent._
import ExecutionContext.Implicits.global

val session = socialNetwork.createSessionFor("user", credentials)
val f: Future[List[Friend]] = Future {
  session.getFriends()
}

val title = Future {
  "hello" * 12 + "WORLD !!"
}

Futures are generally asynchronous and do not block the underlying execution threads. But futures also provide blocking of execution thread for certain cases. It provides blocking by either invoking arbitrary code that blocks the thread from within the future, or blocking from outside another future, waiting until that future gets completed.

val blockedForThisName = Future {
  blocking {
    "This is Blocked"
  }
}

While the Future is a read-only container, a promise is a writable, single-assignment container that is used to complete a future. Futures and promises can be seen as two different sides of a pipe. On the promise side, data is pushed in, while on the future side, data can be pulled out. A promise can be used to either successfully complete a future with a value using the success method, or to complete a future with an exception, by failing the promise using the failure method.

import scala.concurrent.Promise
import scala.util.{Success, Failure}

val getNameFuture = Future { "Tom" }
val getNamePromise = Promise[String]()

getNamePromise completeWith getNameFuture

getNamePromise.future.onComplete {
  case Success(name) => println(s"Got the name: $name")
  case Failure(e) => e.printStackTrace()
}

By default, futures and promises are non-blocking, making use of callbacks instead of typical blocking operations. Future and Promises revolve around ExecutionContexts, responsible for executing computations.

ExecutionContext

An ExecutionContext is similar to an Executor, where it executes computations in a new thread, in a pooled thread or in the current thread (which is discouraged). Scala provides a built-in scala.concurrent.ExecutionContext implementation with a global static pool. Also, a Java Executor can be converted to an ExecutionContext using the ExecutionContext.fromExecutor method, which wraps it. Execution contexts execute the tasks submitted to them, similar to thread pools. They are essential for the Future.apply method because they handle how and when the asynchronous computation is executed. We can either define our own execution contexts and use them with Future, or use the default execution context by importing ExecutionContext.Implicits.global. Below is an example where the execution of fatMatrix.inverse() is delegated to an ExecutionContext, and the result is provided to inverseFuture.

val inverseFuture: Future[Matrix] = Future {
  fatMatrix.inverse() // non-blocking long lasting computation
}(executionContext)
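
A custom execution context can be built by wrapping a Java Executor with ExecutionContext.fromExecutor. The following is a small sketch under assumed names (CustomContextSketch, sumOfSquares) and an arbitrary pool size of two threads.

```scala
import java.util.concurrent.{ExecutorService, Executors}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object CustomContextSketch {
  def sumOfSquares(n: Int): Int = {
    // Pool size of 2 is an arbitrary choice for this sketch.
    val pool: ExecutorService = Executors.newFixedThreadPool(2)
    val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // The execution context is passed explicitly instead of being imported implicitly.
      val f: Future[Int] = Future { (1 to n).map(i => i * i).sum }(ec)
      Await.result(f, 5.seconds)
    } finally {
      pool.shutdown() // threads of a custom pool must be released by the caller
    }
  }

  def main(args: Array[String]): Unit =
    println(sumOfSquares(3)) // prints 14
}
```

Unlike the global context, a pool created this way is owned by the caller, so it has to be shut down explicitly once it is no longer needed.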

Global Execution Context

ExecutionContext.global is an ExecutionContext backed by a ForkJoinPool, which manages a limited number of threads. The maximum number of threads is referred to as the parallelism level. The number of concurrently blocking computations can exceed the parallelism level only if each blocking call is wrapped inside a blocking block; otherwise the thread pool of the global execution context may be starved, and no computation can proceed. By default, ExecutionContext.global sets the parallelism level of its underlying fork-join pool to the number of available processors, as reported by Runtime.availableProcessors. This can be overridden with the scala.concurrent.context.minThreads, scala.concurrent.context.numThreads and scala.concurrent.context.maxThreads system properties. The global ExecutionContext can be brought into implicit scope by importing ExecutionContext.Implicits.global. Since a ForkJoinPool is not designed for long-lasting blocking operations, such operations are better run on a dedicated ExecutionContext. The example below uses the global execution context to fetch a list of friends asynchronously.

import scala.concurrent._
import ExecutionContext.Implicits.global

val session = socialNetwork.createSessionFor("user", credentials)
val f: Future[List[Friend]] = Future {
  session.getFriends()
}
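
To illustrate the blocking construct on the global execution context, here is a hedged sketch in which every task sleeps inside blocking, so the pool may add threads instead of starving. BlockingHintSketch, the 100 ms sleep and the 30 second timeout are all assumptions made for the example.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BlockingHintSketch {
  // Starts `count` futures that each block briefly, then collects their results.
  def runBlockingTasks(count: Int): List[Int] = {
    val tasks = (1 to count).toList.map { i =>
      Future {
        // Marks the sleep as blocking so the fork-join pool can compensate.
        blocking { Thread.sleep(100) }
        i
      }
    }
    // Future.sequence keeps the results in the order of the input list.
    Await.result(Future.sequence(tasks), 30.seconds)
  }

  def main(args: Array[String]): Unit = {
    val count = Runtime.getRuntime.availableProcessors * 2
    println(runBlockingTasks(count).size)
  }
}
```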

Callbacks

When a client requires the value computed by a future, it could block its own computation and wait until the future is completed. Although the Future API allows this, it is recommended to obtain the result in a completely non-blocking way, by registering a callback on the future. The callback is then called asynchronously once the future is completed. The onComplete method, which takes a callback function of type Try[T] => U, is the most general way to register a callback. Try[T] is a monad similar to Option[T], except that it holds either a value or a throwable object: it is a Success[T] when it holds a value and a Failure[T] when it holds an exception. The onComplete method therefore lets the client handle the result of both failed and successful future computations. To handle only successful results, the foreach callback is used. The onComplete and foreach methods both have result type Unit, which means invocations of these methods cannot be chained.

There is no guarantee that a callback is executed by the thread that completes the future or by the thread that registered the callback; it is executed by some thread, at some time after the future is completed. The order of execution of callbacks is not predefined either, as they can be called sequentially one after the other or concurrently at the same time, although a particular ExecutionContext implementation may result in a well-defined order. The onComplete callback ensures that the corresponding function is invoked after the future is eventually completed, while the foreach callback only invokes the function if the future is completed successfully. If a callback is registered on a future that is already completed, the callback is still executed eventually. If one callback throws an exception, the other callbacks are executed regardless. If some callbacks never complete, e.g. due to an infinite loop, the other callbacks may not be executed at all; a callback that may block should therefore use the blocking construct. Once executed, the callbacks are removed from the future object, making them eligible for GC.

import scala.util.{Success, Failure}

val f: Future[List[String]] = Future {
  session.getRecentPosts
}

f onComplete {
  case Success(posts) => for (post <- posts) println(post)
  case Failure(t) => println("An error has occurred: " + t.getMessage)
}

f foreach { posts =>
  for (post <- posts) println(post)
}
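
The snippet above depends on a session object defined elsewhere. A self-contained sketch of the same Success and Failure handling could look as follows; CallbackSketch and parse are hypothetical, and the Promise with Await is used only to hand the callback's observation back to the caller.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object CallbackSketch {
  // Hypothetical computation that fails for non-numeric input.
  def parse(text: String): Future[Int] = Future { text.trim.toInt }

  // Registers an onComplete callback and reports what the callback observed.
  def describe(text: String): String = {
    val observed = Promise[String]()
    parse(text).onComplete {
      case Success(n) => observed.success(s"parsed $n")
      case Failure(e) => observed.success(s"failed: ${e.getClass.getSimpleName}")
    }
    Await.result(observed.future, 2.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(describe("42"))  // prints "parsed 42"
    println(describe("abc")) // prints "failed: NumberFormatException"
  }
}
```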

Combinators

The foreach and onComplete methods often result in overly indented and bulky code in real-life scenarios, where futures are frequently nested several levels deep. To avoid this, futures provide combinators which allow a more straightforward composition. One of the basic combinators is map, which, given a future and a mapping function for the value of the future, produces a new future that is completed with the mapped value once the original future is successfully completed. Hence futures can be mapped in the same way as collections. If the original future is completed successfully, the returned future is completed with the mapped value. If the mapping function throws an exception, the returned future is completed with that exception. If the original future fails with an exception, the returned future also contains the same exception. Below is an example of the map combinator.

val rateQuote = Future {
  connection.getCurrentValue(USD)
}

val purchase = rateQuote map { quote =>
  if (isProfitable(quote)) connection.buy(amount, quote)
  else throw new Exception("not profitable")
}

purchase foreach { amount =>
  println("Purchased " + amount + " USD")
}
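
The three outcomes of map described above can be observed with a small self-contained sketch. MapSketch, quoteToOrder, settle and the profitability threshold of 1.5 are assumptions made for illustration only.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object MapSketch {
  // Maps a quote to an order, or throws when the (assumed) threshold is exceeded.
  def quoteToOrder(quote: Future[Double]): Future[String] =
    quote.map { q =>
      if (q < 1.5) s"buy at $q"
      else throw new IllegalStateException("not profitable")
    }

  // Blocks to expose the final state of a future as a Try (for demonstration only).
  def settle[T](f: Future[T]): Try[T] = Try(Await.result(f, 2.seconds))

  def main(args: Array[String]): Unit = {
    // Original succeeds and the mapping succeeds: mapped value.
    println(settle(quoteToOrder(Future.successful(1.2))))
    // Original succeeds but the mapping function throws: that exception.
    println(settle(quoteToOrder(Future.successful(2.0))))
    // Original fails: the same exception is propagated and the function is never called.
    println(settle(quoteToOrder(Future.failed(new RuntimeException("offline")))))
  }
}
```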

Scala allows futures to be used in for-comprehensions of the form for (enumerators) yield e, where each enumerator is either a generator, which introduces new variables, or a filter. To support this, Scala futures have the flatMap and withFilter combinators. The flatMap method takes a function that maps the value to a new future g, and then returns a future which is completed once g is completed. In other words, the flatMap operation maps its own value into some other future. Once this different future is completed, the resulting future is completed with its value. The filter combinator creates a new future which contains the value of the original future only if it satisfies some predicate. Otherwise, the new future is failed with a NoSuchElementException. The recover, recoverWith and fallbackTo combinators, in general, create a new future which holds the same result as the original future if it completed successfully.
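
As a rough sketch of these combinators, the example below composes two futures in a for-comprehension and handles a failed filter with recover. ComprehensionSketch, the integer prices and the helper names are all hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComprehensionSketch {
  // Hypothetical price sources, in cents to avoid floating-point rounding.
  def usd: Future[Int] = Future { 110 }
  def chf: Future[Int] = Future { 95 }

  // Desugars to: usd.flatMap(u => chf.withFilter(c => u > c).map(c => u - c))
  def spread: Future[Int] = for {
    u <- usd
    c <- chf
    if u > c
  } yield u - c

  // filter fails with NoSuchElementException for even numbers; recover replaces it with 0.
  def oddOrZero(n: Int): Future[Int] =
    Future.successful(n)
      .filter(_ % 2 == 1)
      .recover { case _: NoSuchElementException => 0 }

  // Blocks for demonstration purposes only.
  def await[T](f: Future[T]): T = Await.result(f, 2.seconds)

  def main(args: Array[String]): Unit = {
    println(await(spread))       // prints 15
    println(await(oddOrZero(3))) // prints 3
    println(await(oddOrZero(4))) // prints 0
  }
}
```

Note that usd and chf are started one after the other here, because the second generator only runs once the first future has produced its value.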