Practically Designing an API to Be an Asset

Any software you design will be used in ways you did not anticipate, and that is especially true when you are designing APIs, which nowadays usually means Web REST APIs. Every API has different audiences or users. There are the direct API users you know of, defined in the specifications or solution design/architecture documents, such as web or mobile clients. Then there are the ones you did not anticipate, which come at later stages: for example, the API might be used to synchronize data with another system, or to extract data for migration to another system. Nowadays, APIs are considered assets that will live longer than the projects or systems they are built for. Taking that into account, there are many things that need to be considered when designing an API. Let us assume we are designing an API for an Authors/Books microservice, as shown in the diagram below.

 

 

In the Author/Book example, we want to implement a CRUD (Create, Read, Update, and Delete) API for the Author and Book entities. The requirements for this API are:

  1. Search and listing functionality to:
    1. Get a list of authors using various search terms
    2. Get an author's books
    3. Search books by author or genre
  2. Add, update, and delete an author or a book

To implement these requirements and make this API an asset for later use, we take the following steps.

1: Define the Data Types for the API Data Objects

Based on the requirements, we first define the schemas for the Author and Book objects that the API will support.

For each resource the API supports, we define a schema for:

  1. Data Manipulation Object
  2. Full object
  3. Full Object with HATEOAS (Links for operations)

In RAML, we can define these objects as follows:

#%RAML 1.0
title: GT Books API Types

types:
  AuthorMod:
    description: This is the Author type for data manipulation
    type: object
    properties:
      Name:
        required: true
        example: Moustafa Refaat
        description: Author Name
        type: string
      Nationality:
        required: true
        example: Canadian
        description: Author Nationality
        type: string
      Date-of-Birth:
        required: true
        example: 2018-12-09
        description: Author Date of Birth
        type: date-only
      Date-of-Death:
        required: false
        example: 2018-12-09
        description: Author Date of Death
        type: date-only

  Author:
    description: This is the full Author
    type: AuthorMod
    properties:
      Id:
        required: true
        example: 1
        description: Author Id
        type: integer
      Age:
        required: false
        maximum: 200
        minimum: 8
        example: 10
        description: Author Age
        type: integer

  AuthorHateoas:
    description: Author with HATEOAS information (links)
    type: Author
    properties:
      Links:
        required: true
        description: HATEOAS links for the available operations
        type: array
        items: Link

  BookMod:
    description: Book info for data manipulation
    type: object
    properties:
      AuthorId:
        required: true
        example: 1
        description: Author Id
        type: integer
      Name:
        required: true
        example: Example
        description: Book Name
        type: string
      Genre:
        required: true
        example: Example
        description: Book Genre
        type: string
      Stars-Rating:
        required: false
        maximum: 5
        minimum: 0
        example: 1
        description: Book Rating
        type: integer
      ISBN:
        required: true
        example: Example
        description: Book ISBN
        type: string
      PublishDate:
        required: true
        example: 2018-12-09
        description: Book Publish Date
        type: date-only

  Book:
    description: Book info
    type: BookMod
    properties:
      Id:
        required: true
        example: 1
        description: Book Id
        type: integer
      AuthorName:
        required: true
        example: Moustafa Refaat
        description: Author Name
        type: string

  BookHateoas:
    description: Book information with HATEOAS links
    type: Book
    properties:
      Links:
        required: true
        description: HATEOAS links for the available operations
        type: array
        items: Link

  Link:
    description: HATEOAS link
    type: object
    properties:
      href:
        required: true
        example: /Book/10
        description: URL Link
        type: string
      rel:
        required: true
        example: GetBook
        description: Operation
        type: string
      method:
        required: true
        example: GET
        description: HTTP method (GET, PUT, ...)
        type: string

2: Define the URL Resources for the API

For each resource, define and implement all the HTTP methods (Get, Post, Put, Delete, Patch, Head, Options) even if you are not going to use them now. You can make the implementation return a “403 Forbidden” response to the client to indicate that the operation is not supported. For our example we will have the following resources (a minimal RAML sketch of one of them follows the list):

  • /Authors
    • Get: Search/list authors; returns a body with the list of authors matching the search criteria, with headers containing the paging information
    • Post: Create a new author; returns the created author
    • Put: Not supported; returns a “403 Forbidden” error
    • Delete: Not supported; returns a “403 Forbidden” error
    • Patch: Not supported; returns a “403 Forbidden” error
    • Head: Returns the same as Get with an empty body
    • Options: Returns the methods supported by this resource: “Get, Post, Head, Options”
  • /Authors/{id}
    • Get: Returns the author with the supplied ID
    • Post: Not supported; returns a “403 Forbidden” error
    • Put: Update the author; returns the updated author
    • Delete: Deletes the author
    • Patch: Not supported; returns a “403 Forbidden” error
    • Head: Not supported; returns a “403 Forbidden” error
    • Options: Returns “Get, Put, Delete, Options”
  • /Authors/{id}/Books
    • Get: Search/list the author's books; returns a body with the list of books matching the search criteria, with headers containing the paging information
    • Post: Create a new book for the author with that ID; returns the created book
    • Put: Not supported; returns a “403 Forbidden” error
    • Delete: Not supported; returns a “403 Forbidden” error
    • Patch: Not supported; returns a “403 Forbidden” error
    • Head: Returns the same as Get with an empty body
    • Options: Returns “Get, Post, Head, Options”
  • /Books
    • Get: Search/list books; returns a body with the list of books matching the search criteria, with headers containing the paging information
    • Post: Create a new book; returns the created book
    • Put: Not supported; returns a “403 Forbidden” error
    • Delete: Not supported; returns a “403 Forbidden” error
    • Patch: Not supported; returns a “403 Forbidden” error
    • Head: Returns the same as Get with an empty body
    • Options: Returns “Get, Post, Head, Options”
  • /Books/{id}
    • Get: Returns the book with the supplied ID
    • Post: Not supported; returns a “403 Forbidden” error
    • Put: Update the book; returns the updated book
    • Delete: Deletes the book
    • Patch: Not supported; returns a “403 Forbidden” error
    • Head: Not supported; returns a “403 Forbidden” error
    • Options: Returns “Get, Put, Delete, Options”
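As an illustration, here is a minimal RAML sketch of how the /Authors resource could be declared against the types defined in step 1. This is only one possible shape, not the full specification: the query parameters and the empty 403 responses for the unsupported methods are assumptions made for the sketch.

/Authors:
  get:
    description: Search/list authors
    queryParameters:
      name:
        type: string
        required: false
      nationality:
        type: string
        required: false
    responses:
      200:
        body:
          application/json:
            type: Author[]
  post:
    description: Create a new author
    body:
      application/json:
        type: AuthorMod
    responses:
      201:
        body:
          application/json:
            type: Author
  put:
    description: Not supported
    responses:
      403:
  delete:
    description: Not supported
    responses:
      403:
  patch:
    description: Not supported
    responses:
      403:
  options:
    description: Lists the supported methods (Get, Post, Head, Options)
    responses:
      200:

The /Books, /Authors/{id}, and the other resources follow the same pattern.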

 

To be continued…


MuleSoft: Understanding Exception Handling

I have been approached by several developers taking the Anypoint Platform Development: Fundamentals (Mule 4) training about exception handling and the different scenarios in Module 10. The way it is described can be confusing, so here is how I understand it. Consider any MuleSoft flow like the one below.

How would this be executed? Anypoint Studio and the Mule runtime convert the flow into Java byte code, so it is effectively generated as a method or function in Java. Mule wraps the code for this function in a try-catch scope. If you have defined exception handlers in the flow's error handling section, it emits code to catch and handle those exceptions. If you have not, and there is a global error handler, it emits the catch blocks of the global error handler instead. Here is the thing that catches developers by surprise: if you have defined local handlers, those cases are the only ones that get handled. It is not the combination of the local cases and the global error handler cases; only one of them applies. If a local error handler is defined, that is it; if not, the global error handler is emitted as the catch cases.

The second point is the On Error Propagate and On Error Continue options. If you choose On Error Propagate, the emitted code re-throws the caught exception at the end of each catch block. If you choose On Error Continue, the exception is not re-thrown. Think of it as the code written below. If you have been a Java, C#, C++, or Python developer, you should recognize these basic programming concepts.


import java.io.EOFException;
import java.util.logging.Level;
import java.util.logging.Logger;

public void mainMethod() {
    try {
        onErrorPropagate();   // re-throws, so execution jumps to the catch below
        onErrorContinue();    // swallows the exception, so execution continues normally
    } catch (Exception e) {
        // default handler, or the global error handler if one is defined
    }
}

// On Error Propagate: the caught exception is logged and then re-thrown
public void onErrorPropagate() throws EOFException {
    try {
        throw new EOFException();
    } catch (EOFException e) {
        Logger.getLogger("ExceptionHandler").log(Level.SEVERE, "Error", e.toString());
        throw e;
    }
}

// On Error Continue: the caught exception is logged and NOT re-thrown
public void onErrorContinue() {
    try {
        throw new EOFException();
    } catch (EOFException e) {
        Logger.getLogger("ExceptionHandler").log(Level.SEVERE, "Error", e.toString());
    }
}
Hope this helps

MuleSoft: Tricky Question in the DataWeave Training Quiz

I am working on MuleSoft certification and taking the MuleSoft 4.1 Fundamentals course. An interesting question in the module quiz is as below.

Refer to the exhibit. What is valid DataWeave code to transform the input JSON payload to the output XML payload?

Answers

The four answer options (A, B, C, and D) were shown as DataWeave snippets in the exhibit images, which are not reproduced here.
Here we need to produce XML attributes, so the “@” is required to define the attributes; this disqualifies options B and D. Now A and C look very similar, except that C uses a “;” (yes, a semicolon), which is invalid, so the only correct answer is A.

The interesting thing about this question is that it hides a small trick. Maybe that is the kind of question we should expect in the certification exam?
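Since the exhibit images are not reproduced here, the following is only a rough DataWeave 2.0 sketch of the kind of transform option A describes; the field names (order, item, itemId) are invented for illustration and are not taken from the actual quiz.

%dw 2.0
output application/xml
---
order: {
    // the "@" prefix turns itemId into an XML attribute instead of a child element
    item @(itemId: payload.itemId): payload.item
}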

MuleSoft: File:List Interesting Observation

Working with the MuleSoft File connector, I was expecting that the File > List (https://docs.mulesoft.com/connectors/file/file-list) operation would return a list of fileInfo objects (you know, path, size, etc.), but it actually returns a list of the contents of the files in the directory. This seemed odd to me, as the documentation states:


“The List operation returns a List of Messages, where each message represents any file or folder found within the Directory Path (directoryPath). By default, the operation does not read or list files or folders within any sub-folders of directoryPath. To list files or folders within any sub-folders, you can set the recursive parameter to true.”

https://docs.mulesoft.com/connectors/file/file-list

Here is the sample I was working with

I was intending to put a Read operation inside the For Each; however, that just gave me an error.

Here is a sample of the logged messages.

That was a head-scratcher. I thought I had made some mistake in the list parameters, but it seems that is how the File connector's List operation works. Below you will see that, as part of the message for each file, the typeAttributes carry the fileInfo information.
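As a sketch of what this means in practice: since each entry in the List result is a Mule Message whose attributes carry the file metadata, a transform along these lines (a rough sketch assuming Mule 4 and the File connector; the exact attribute field names are an assumption and may vary) can pull out just the fileInfo-style data instead of the file contents:

%dw 2.0
output application/json
---
// payload is the Array of Messages returned by file:list;
// each item's attributes hold the file metadata rather than the file contents
payload map (item, index) -> {
    path: item.attributes.path,
    fileName: item.attributes.fileName,
    size: item.attributes.size
}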

Practical REST API Design, implementation and Richardson Maturity Model

Richardson Maturity Model classifies REST API maturity as follows

  • Level Zero: These services have a single URI and use a single HTTP method (typically POST). For example, most Web Services (WS-*) based services use a single URI to identify an endpoint and HTTP POST to transfer SOAP-based payloads, effectively ignoring the rest of the HTTP verbs. The same applies to XML-RPC based services, which send data as Plain Old XML (POX). These are the most primitive ways of building SOA applications, with a single POST method and XML used to communicate between services.
  • Level One: These services employ many URIs but only a single HTTP verb, generally HTTP POST. They give each individual resource in their universe a URI. Every resource is separately identified by a unique URI, and that makes them better than level zero.
  • Level Two: Level two services host numerous URI-addressable resources. Such services support several of the HTTP verbs on each exposed resource, providing Create, Read, Update, and Delete (CRUD) services. Here the state of resources, typically representing business entities, can be manipulated over the network, and the service designer expects people to put some effort into mastering the API, generally by reading the supplied documentation. Level two is a good use of REST principles, which advocate using different verbs based on the HTTP request method, and the system can have multiple resources.
  • Level Three: Level three makes use of URIs, HTTP, and HATEOAS. This is the most mature level of Richardson's model; it encourages easy discoverability and makes responses self-explanatory by using HATEOAS. The service leads consumers through a trail of resources, causing application state transitions as a result.

Where HATEOAS (Hypermedia as the Engine of Application State) is a constraint of the REST application architecture that keeps the RESTful style architecture unique from most other network application architectures. The term “hypermedia” refers to any content that contains links to other forms of media such as images, movies, and text. This architectural style lets you use hypermedia links in the response contents so that the client can dynamically navigate to the appropriate resource by traversing the hypermedia links. This is conceptually the same as a web user navigating through web pages by clicking the appropriate hyperlinks to achieve a final goal. Like a human’s interaction with a website, a REST client hits an initial API URI and uses the server-provided links to dynamically discover available actions and access the resources it needs. The client need not have prior knowledge of the service or the different steps involved in a workflow. Additionally, the clients no longer have to hard code the URI structures for different resources. This allows the server to make URI changes as the API evolves without breaking the clients.

Naturally you would want to build to the highest standard and provide a level three REST API. That would mean providing a links field, as in the following example from the GT-IDStorm API.

The data payload, as you can see from this sample, is huge compared to the actual data returned.

{
  "value": [
    {
      "id": "63b2c70e-2bcb-4335-9961-3d14be642163",
      "name": "Entity-1",
      "description": "Testing Entity 1",
      "links": [
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163", "rel": "self", "method": "GET" },
        { "href": null, "rel": "get_entitydefinition_byname", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/full", "rel": "get_full_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163", "rel": "delete_entitydefinition", "method": "DELETE" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/attributes", "rel": "create_attribute_for_entitydefinition", "method": "POST" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/attributes", "rel": "get_attributes_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/systems", "rel": "create_system_for_entitydefinition", "method": "POST" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/systems", "rel": "get_system_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/data", "rel": "get_data_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/63b2c70e-2bcb-4335-9961-3d14be642163/data/GetEntityDataWithMissingSystems", "rel": "get_data_WithMissingSystems_for_entitydefinition", "method": "GET" }
      ]
    },
    {
      "id": "54bc1f18-0fd5-43dd-9309-4d8659e3aa91",
      "name": "Entity-10",
      "description": "Testing Entity 10",
      "links": [
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91", "rel": "self", "method": "GET" },
        { "href": null, "rel": "get_entitydefinition_byname", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/full", "rel": "get_full_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91", "rel": "delete_entitydefinition", "method": "DELETE" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/attributes", "rel": "create_attribute_for_entitydefinition", "method": "POST" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/attributes", "rel": "get_attributes_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/systems", "rel": "create_system_for_entitydefinition", "method": "POST" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/systems", "rel": "get_system_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/data", "rel": "get_data_for_entitydefinition", "method": "GET" },
        { "href": "https://localhost:44379/api/v1/entity/54bc1f18-0fd5-43dd-9309-4d8659e3aa91/data/GetEntityDataWithMissingSystems", "rel": "get_data_WithMissingSystems_for_entitydefinition", "method": "GET" }
      ]
    }
  ],
  "links": [
    { "href": "https://localhost:44379/api/v1/entity?orderBy=Name&searchQuery=Testing%20Entity%201&pageNumber=1&pageSize=10", "rel": "self", "method": "GET" }
  ]
}

For example, if we remove the HATEOAS requirement, the data returned for the same query would be:

This reduction would have a huge impact on the performance of the system as a whole: less traffic on the network, and less data for the clients and servers to process and manipulate.

[
  {
    "id": "63b2c70e-2bcb-4335-9961-3d14be642163",
    "name": "Entity-1",
    "description": "Testing Entity 1"
  },
  {
    "id": "54bc1f18-0fd5-43dd-9309-4d8659e3aa91",
    "name": "Entity-10",
    "description": "Testing Entity 10"
  }
]

I usually implement the API to accept an Accept header with multiple options:

  • application/json: returns just the data
  • application/hateoas+json: returns the data with the HATEOAS (links) data.

I also implement another resource or operation that provides the link structures.
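For illustration, here is a rough sketch of how that Accept-header switch could look in an ASP.NET Core controller. This is an assumed, simplified implementation, not the actual GT-IDStorm code; the entity shape, the link building, and the application/hateoas+json handling are placeholders.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.AspNetCore.Mvc;

// Placeholder entity shape for the sketch.
public record Entity(Guid Id, string Name, string Description);

[ApiController]
[Route("api/v1/entity")]
public class EntityController : ControllerBase
{
    // Assumed in-memory data source; a real implementation would inject a repository.
    private static readonly List<Entity> Entities = new()
    {
        new Entity(Guid.NewGuid(), "Entity-1", "Testing Entity 1")
    };

    [HttpGet]
    public IActionResult GetEntities()
    {
        string accept = Request.Headers["Accept"].ToString();

        if (accept.Contains("application/hateoas+json"))
        {
            // Client explicitly asked for HATEOAS: include the links with each entity.
            var withLinks = Entities.Select(e => new
            {
                e.Id,
                e.Name,
                e.Description,
                links = new[]
                {
                    new { href = $"/api/v1/entity/{e.Id}", rel = "self", method = "GET" },
                    new { href = $"/api/v1/entity/{e.Id}", rel = "delete_entitydefinition", method = "DELETE" }
                }
            });
            return Ok(withLinks);
        }

        // Default application/json: just the data, no links.
        return Ok(Entities);
    }
}

A client that sends Accept: application/hateoas+json gets the links; everything else gets the plain payload, which keeps the default response small.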

In Conclusion

I would recommend implementing the API to:

  • Support Level Two and Level Three at the same time by using the Accept header of the request:
    • application/json: returns just the data
    • application/hateoas+json: returns the data with the HATEOAS (links) data.
  • Implement another resource, or the root, that returns the URL structures and operations supported by the API.

As I found, supporting only HATEOAS makes the system pay a heavy price in performance, especially with large data loads, while very few clients, if any, utilize the links returned. I would love to hear your thoughts on, and experience with, APIs and HATEOAS.

SharePoint Content Management

I met with a previous colleague a few days ago who had moved to an IT research and advisory company. He was working on writing a best-practices recommendation for using SharePoint and utilizing SharePoint's content management capabilities in enterprise data governance. Since I had worked in many organizations and used SharePoint in many solutions, he wanted to get some insight from me. While it is flattering to be the man of insight, I had concerns about the “best practices” concept. We are not talking about SharePoint in general, which you can get from Microsoft; after all, they are the vendor and know it best. In general, I would say best practices have to be divided by industry sector (retail, banking, finance, manufacturing, technology, etc.), and for each organization there will be best practices in that organization's context. Maybe having someone seek my insight made me too philosophical, so below are the questions he had for me and the answers I gave. Please share how you would have responded to them.

 

What were the key challenges that SharePoint helped you solve?

I have worked on so many SharePoint projects with many different challenges. A few of the major projects are:

  • The City of Vaughan online portal implementation, which was based on the 2007 version. It was used for internal and external collaboration, cooperation, and communication with the community of City of Vaughan residents.
  • Adecco USA, which was a SharePoint 2010 implementation. It exposed different portals for the different Adecco head-hunting brands; users would search for jobs, post resumes, create profiles, and communicate with the head hunters through it.
  • For the Government of Ontario, I worked on multiple SharePoint projects building portals for cooperation and processing.
  • For TPS, I built several SharePoint-based applications for automating workflows. The most famous one is the Paid Duty Management system (basically overtime for uniformed officers), where officers can apply for paid duty and the system, based on the business rules, would process the applications.

So SharePoint was mostly used as a platform to build applications on, allowing me to concentrate on the core of the required solution.

 

 

What architecture did you decide to adopt for SharePoint in your organization (cloud, premise, hybrid)?

It has mostly been on-premises; lately, with SharePoint in Office 365, there is adoption of the cloud. Any new project is basically being built in the cloud, or the on-premises solutions are being migrated to the cloud.

If a migration took place, why did you choose to do this? If greenfield why this architecture?

The migration decision was mostly to utilize Azure.

Did you install and configure any third party add-ins?

On-premises, there were many third-party add-ins installed freely. On Azure you have to choose from the Marketplace. But in general, yes, utilizing some COTS products helps to reduce the complexity of your solution.

 

Do you use SharePoint for Web Content Management? If so, what capabilities are useful for you?

Yes; basically it was used to provide governance and control over the data and documents being added to SharePoint.

What would you say the gaps are in SharePoint's offering for ECM?

There are always gaps in any offering; I cannot name one right now.

How did the Web content management functionality work out for you in SharePoint?

It worked OK for the most part.

If you migrated what version did you migrate to and from?

Yes, I did several migrations, from 2003 to 2007, which was the hardest as the structure was completely different; I basically used third-party tools to facilitate moving and reformatting the data.

What benefits did you gain from the upgrade?

The newer versions always provide better performance, new features, etc., so we utilized the advantages of the new version over the older one.

What Backup strategies did you put in place for SharePoint?

This question reminds me of an application I built that would back up SharePoint documents to AWS S3. I believe I built it for SharePoint 2007. Below is the flyer I had for it.

 

Installing Hadoop 2.7.3 On MacOS Sierra Version 10.13.1

I have got Hadoop running on Windows 2012 and Ubuntu successfully and easily by following the instructions online. However, I struggled to get it running correctly on my iMac. So here is the correct way to set it up on an iMac with macOS Sierra, as I could not find any reliable instructions online. Hope this helps.

  • Verify that you have Java installed on your machine


  • From System Preferences -> Sharing, enable “Remote Login”


    • Execute the following command

$ ssh localhost

The authenticity of host ‘localhost (::1)’ can’t be established.

ECDSA key fingerprint is

  • If the above is printed execute

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Generating public/private dsa key pair.

Your identification has been saved in /Users/moustafarefaat/.ssh/id_dsa.

Your public key has been saved in /Users/moustafarefaat/.ssh/id_dsa.pub.

The key fingerprint is:

SHA256:GhGErnXItqkupPq/BorbfKQge2aPWZTXowlCW1ADkRc moustafarefaat@Moustafas-iMac.local

The key’s randomart image is:

+—[DSA 1024]—-+

| +=E+o |

| ..o. . |

| .+… |

| . o*..o |

| o+++o S |

|o.oo+o = . |

|=+ =. + |

|*o*+o |

|+OB=+. |

+—-[SHA256]—–+

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

  • /usr/libexec/java_home

    /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home

  • Download Hadoop from

    http://apache.mirror.rafal.ca/hadoop/common/

    I copied the downloaded directory to Hadoop on my machine and moved it to the /Users/Shared/hadoop folder.

    In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:

# or more contributor license agreements. See the NOTICE file

# distributed with this work for additional information

# regarding copyright ownership. The ASF licenses this file

# to you under the Apache License, Version 2.0 (the

# “License”); you may not use this file except in compliance

# with the License. You may obtain a copy of the License at

#

# http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an “AS IS” BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are

# optional. When running a distributed configuration it is best to

# set JAVA_HOME in this file, so that it is correctly defined on

# remote nodes.

# The java implementation to use.

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home

#Moustafa
export HADOOP_HOME=/users/shared/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=2000
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"

#HADOOP_JAVA_PLATFORM_OPTS=”-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS”

# On secure datanodes, user to run the datanode as after dropping privileges.

# This **MUST** be uncommented to enable secure HDFS if using privileged ports

# to provide authentication of data transfer protocol. This **MUST NOT** be

# defined if SASL is configured for authentication of data transfer protocol

# using non-privileged ports.

export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.

#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.

export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###

# HDFS Mover specific parameters

###

# Specify the JVM options to be used when starting the HDFS Mover.

# These options will be appended to the options specified as HADOOP_OPTS

# and therefore may override any similar flags set in HADOOP_OPTS

#

# export HADOOP_MOVER_OPTS=””

###

# Advanced Users Only!

###

# The directory where pid files are stored. /tmp by default.

# NOTE: this should be set to a directory that can only be written to by

# the user that will run the hadoop daemons. Otherwise there is the

# potential for a symlink attack.

export HADOOP_PID_DIR=${HADOOP_PID_DIR}

export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.

export HADOOP_IDENT_STRING=$USER

Edit core-site.xml to set the fs.defaultFS location to localhost. While many websites recommend using port 9000, for some reason that port was already in use on my iMac, so I used 7500 instead.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:7500</value>
  </property>
</configuration>

Create a data folder in /Users/Shared/hadoop and edit the file hdfs-site.xml as follows; otherwise Hadoop will create its files in the /tmp folder.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/Users/Shared/hadoop/data/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

Next, edit the file yarn-site.xml as follows:

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Now you are ready to start. From a terminal window, run the command

. /users/shared/hadoop/etc/hadoop/hadoop-env.sh

Notice the dot and then the space first; this ensures the script runs in the current session. If you then run the command printenv, you should see all the variables set up correctly.

Now run the command to format the namenode (notice the STARTUP_MSG and SHUTDOWN_MSG output shown below):

$HADOOP_HOME/bin/hdfs namenode -format

2016-11-09 16:19:18,802 INFO [main] namenode.NameNode (LogAdapter.java:info(47)) – STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG: host = moustafas-imac.local/192.168.2.21

STARTUP_MSG: args = [-format]

STARTUP_MSG: version = 2.7.3

STARTUP_MSG: classpath = …

STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff; compiled by ‘root’ on 2016-08-18T01:41Z

STARTUP_MSG: java = 1.8.0_102

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at moustafas-imac.local/192.168.2.21

************************************************************/

Open the hosts file in /private/etc to add your Hadoop machine name (note that you will have to save the edited file to Documents and then copy it back with administrative privileges).

##

# Host Database

#

# localhost is used to configure the loopback interface

# when the system is booting. Do not change this entry.

##

127.0.0.1    localhost

255.255.255.255    broadcasthost

127.0.1.1 moustafas-imac.local

::1 localhost

Now you are ready to start HDFS; run the command

$HADOOP_HOME/sbin/start-dfs.sh

and to start YARN run

$HADOOP_HOME/sbin/start-yarn.sh

To test our setup run the following commands

$HADOOP_HOME/bin/hadoop fs -mkdir /user

Moustafas-iMac:hadoop moustafarefaat$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/moustafarefaat

Moustafas-iMac:hadoop moustafarefaat$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/moustafarefaat/input

$HADOOP_HOME/bin/hadoop fs -put $HADOOP_HOME/etc/hadoop/*.* /user/moustafarefaat/input

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'

$HADOOP_HOME/bin/hdfs dfs -cat output/*

you should see results like below

if you browse to http://localhost:50070/explorer.html#/user/moustafarefaat

and if you browse to http://localhost:19888/jobhistory/app

you should see

Now your Hadoop on iMac is ready for use.

Hope this helps

ArchiMate 3.0 Simplified

Introduction

ArchiMate is a modeling language that provides a uniform representation for diagrams that describe Enterprise Architectures. An Enterprise Architecture describes stakeholders' concerns, motivation, and strategy. When modeling an enterprise architecture, there are three main layers you want to describe:

Business Layer: Describes the processes the business uses to meet its goals

Application layer: Describes how specific applications are designed and how they interact with each other.

Technology layer: Describes the hardware and software infrastructure that supports applications and their interactions

There are three other layers (strategy, physical, and implementation and migration) that I will not address here. The ArchiMate modeling language defines three types of elements that are used to represent all the layers (Business, Application, and Technology):

Elements that act (active structure elements), such as business actors, application components, nodes, and interfaces. An active element can be:

An internal active structure element represents an entity that is capable of performing behavior.

A collaboration is an aggregate of two or more active structure elements, working together to perform some collective behavior.

An interaction is a unit of collective behavior performed by (a collaboration of) two or more active structure elements

An external active structure element, called an interface, represents a point of access where one or more services are provided to the environment.

Elements that represent the behavior of those elements that act (behavioral elements) (service, event)

An internal behavior element represents a unit of activity performed by one or more active structure elements.

An external behavior element, called a service, represents an explicitly defined exposed behavior.

An event is a behavior element that denotes a state change

Elements that cannot act and which are acted upon by that behavior (passive elements), such as information, data, and physical objects.

The distinction between behavior and active structure is commonly used to separate what the system must do and how the system does it from the system constituents (people, applications, and infrastructure) that do it. In modeling new systems, it is often useful to start with the behaviors that the system must perform, while in modeling existing systems, it is often useful to start with the people, applications, and infrastructure that comprise the system, and then analyze in detail the behaviors performed by these active structure elements.

Another distinction is between conceptual, logical, and physical abstraction levels. This has its roots in data modeling: conceptual elements represent the information the business finds relevant; logical elements provide logical structure to this information for manipulation by information systems; physical elements describe the storage of this information; for example, in the form of files or database tables. In the ArchiMate language, this corresponds with business objects, data objects, and artifacts, and the realization relationships between them.

Business Layer Elements


Active Elements

  • Business Actor: is a business entity that is capable of performing behavior.
  • Business Role: is the responsibility for performing specific behavior, to which an actor can be assigned, or the part an actor plays in a particular action or event
  • Business Collaboration: is an aggregate of two or more business internal active structure elements that work together to perform collective behavior.
  • Business Interface: is a point of access where a business service is made available to the environment.

Behavioral Elements

  • Business Process: represents a sequence of business behaviors that achieves a specific outcome such as a defined set of products or business services.
  • Business Function: is a collection of business behavior based on a chosen set of criteria (typically required business resources and/or competences), closely aligned to an organization, but not necessarily explicitly governed by the organization
  • Business Interaction: is a unit of collective business behavior performed by (a collaboration of) two or more business roles.
  • Business Service: represents an explicitly defined exposed business behavior.
  • Business Event: is a business behavior element that denotes an organizational state change. It may originate from and be resolved inside or outside the organization.

Passive Elements

  • Product: represents a coherent collection of services and/or passive structure elements, accompanied by a contract/set of agreements, which is offered as a whole to (internal or external) customers.
  • Representation: represents a perceptible form of the information carried by a business object.
  • Contract: represents a formal or informal specification of an agreement between a provider and a consumer that specifies the rights and obligations associated with a product and establishes functional and non-functional parameters for interaction.
  • Business Object : represents a concept used within a particular business domain

Application Layer Elements


Active Elements

  • Application Component: represents an encapsulation of application functionality aligned to implementation structure, which is modular and replaceable. It encapsulates its behavior and data, exposes services, and makes them available through interfaces
  • Application Collaboration: represents an aggregate of two or more application components that work together to perform collective application behavior.
  • Application Interface: represents a point of access where application services are made available to a user, another application component, or a node.

Behavioral Elements

  • Application Process: represents a sequence of application behaviors that achieves a specific outcome.
  • Application Function: represents automated behavior that can be performed by an application component.
  • Application Interaction: represents a unit of collective application behavior performed by (a collaboration of) two or more application components
  • Application Service: represents an explicitly defined exposed application behavior.
  • Application Event: is an application behavior element that denotes a state change.

Passive Elements

  • Data Object: represents data structured for automated processing.

Technology Layer Elements


Active Elements

  • Node: represents a computational or physical resource that hosts, manipulates, or interacts with other computational or physical resources.
  • Device: is a physical IT resource upon which system software and artifacts may be stored or deployed for execution.
  • System Software: represents software that provides or contributes to an environment for storing, executing, and using software or data deployed within it.
  • Technology Collaboration: represents an aggregate of two or more nodes that work together to perform collective technology behavior
  • Technology Interface: represents a point of access where technology services offered by a node can be accessed.
  • Path: represents a link between two or more nodes, through which these nodes can exchange data or material.
  • Communication Network: represents a set of structures and behaviors that connects computer systems or other electronic devices for transmission, routing, and reception of data or data-based communications such as voice and video.

Behavioral Elements

  • Technology Process: represents a sequence of technology behaviors that achieves a specific outcome.
  • Technology Function: represents a collection of technology behavior that can be performed by a node.
  • Technology Interaction: represents a unit of collective technology behavior performed by (a collaboration of) two or more nodes.
  • Technology Service: represents an explicitly defined exposed technology behavior.
  • Technology Event: is a technology behavior element that denotes a state change.

Physical Elements

  • Facility: represents a physical structure or environment.
  • Equipment: represents one or more physical machines, tools, or instruments that can create, use, store, move, or transform materials.
  • Distribution Network: represents a physical network used to transport materials or energy.
  • Material: tangible physical matter or physical elements.
  • Location: represents a geographic location.

Passive Elements

  • Technology Object: represents a passive element that is used or produced by technology behavior.
  • Artifact: represents a piece of data that is used or produced in a software development process, or by deployment and operation of an IT system.

Relationships

The ArchiMate language defines a core set of generic relationships, each of which can connect a predefined set of source and target elements (and, in a few cases, also other relationships). The exact meaning of many of these relationships differs depending on the source and target elements that they connect.

Structural Relationships

Structural relationships represent the ‘static’ coherence within an architecture.

  • The composition relationship indicates that an element consists of one or more other elements. A composition relationship is always allowed between two instances of the same element type.
  • The aggregation relationship indicates that an element groups a number of other elements. An aggregation relationship is always allowed between two instances of the same element type.
  • The assignment relationship expresses the allocation of responsibility, performance of behavior, or execution.
  • The realization relationship indicates that an entity plays a critical role in the creation, achievement, sustenance, or operation of a more abstract entity.

Dependency Relationships

Dependency relationships describe how elements support or are used by other elements. Three types of dependency relationship are distinguished:

  • The serving relationship represents a control dependency, denoted by a solid line. The serving relationship models that an element provides its functionality to another element.
  • The access relationship represents a data dependency, denoted by a dashed line. The access relationship models the ability of behavior and active structure elements to observe or act upon passive structure elements.
  • The influence relationship is the weakest type of dependency, used to model how motivation elements are influenced by other elements. The influence relationship models that an element affects the implementation or achievement of some motivation element.

Dynamic Relationships

The dynamic relationships describe temporal dependencies between elements within the architecture. Two types of dynamic relationships are distinguished:

  • The triggering relationship represents a control flow between elements, denoted by a solid line. The triggering relationship describes a temporal or causal relationship between elements.

  • The flow relationship represents a data (or value) flow between elements, denoted by a dashed line. The flow relationship represents transfer from one element to another.

Other Relationships

  • The specialization relationship indicates that an element is a particular kind of another element.

  • An association models an unspecified relationship, or one that is not represented by another ArchiMate relationship.

  • A junction is used to connect relationships of the same type.

Further Readings

  • Open Group ArchiMate® 3.0 Specification
  • Enterprise Architecture at Work: Modeling, Communication, and Analysis, Third Edition, M.M. Lankhorst et al., Springer, 2013.
  • The Anatomy of the ArchiMate® Language, M.M. Lankhorst, H.A. Proper,H. Jonkers, International Journal of Information Systems Modeling and Design (IJISMD), 1(1):1-32, January-March 2010.
  • Extending Enterprise Architecture Modeling with Business Goals and Requirements, W. Engelsman, D.A.C. Quartel, H. Jonkers, M.J. van Sinderen,
  • Enterprise Information Systems, 5(1):9-36, 2011.
  • TOGAF® Version 9.1, an Open Group Standard (G116), December 2011, published by The Open Group; refer to: http://www.opengroup.org/bookstore/catalog/g116.htm.
  • TOGAF® Framework and ArchiMate® Modeling Language Harmonization: A Practitioner’s Guide to Using the TOGAF® Framework and the ArchiMate® Language, White Paper (W14C), December 2014, published by The Open Group; refer to: http://www.opengroup.org/bookstore/catalog/w14c.htm.
  • Unified Modeling Language®: Superstructure, Version 2.0 (formal/05-07-04), Object Management Group, August 2005.

Implementing Singleton pattern with BizTalk Orchestrations

With BizTalk Orchestrations, a new instance of the orchestration is created every time a new message arrives at the receive port. In a Singleton pattern, only one instance should exist to handle all the messages (or events) in the system. You can implement the Singleton pattern using a simple correlation on the receive port name, as in the figure.

 

Then in your orchestration you have two receive shapes: one to activate the orchestration, and the other inside an infinite loop (or you can put some condition on the loop so you can exit the orchestration at will, which is a good idea) that correlates on the receive port name, as in the figure.

 

That is all that is to it! Enjoy

Simplified BizTalk Content-Based Routing for Pass-Through Data

I had a simple task: image files needed to be routed (“copied”) to another location based on a field set inside the image metadata. I devised a simple solution consisting of a small pipeline component that promotes the field into the message context, plus a dummy schema with the promoted field, with the pipeline component used in a custom pipeline. This solution reads the information and promotes the field values, if they exist, without publishing the whole image into the MessageBox database, delivering performance similar to the standard pass-through pipeline. Below you can find the pipeline and the sample code of the component.


/// <summary>
/// Implements the IComponent.Execute method.
/// </summary>
/// <param name="pContext">Pipeline context.</param>
/// <param name="pInMsg">Input message.</param>
/// <returns>The input message with the Source value promoted to the message context.</returns>
/// <remarks>
/// IComponent.Execute is used to initiate the processing of the message
/// in the pipeline component.
/// </remarks>
public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
    try
    {
        IBaseMessagePart bodyPart = pInMsg.BodyPart;
        string imgsrc = null;

        if (bodyPart != null)
        {
            Stream srcs = bodyPart.GetOriginalDataStream();
            StreamReader tr = new StreamReader(srcs);
            string contents;
            string beginCapture = "<photoshop:Source>";
            string endCapture = "</photoshop:Source>";
            int beginPos;
            int endPos;

            contents = tr.ReadToEnd();
            Debug.Write(contents.Length + " chars" + Environment.NewLine);

            beginPos = contents.IndexOf(beginCapture, 0);
            if (beginPos > 0)
            {
                endPos = contents.IndexOf(endCapture, 0);
                Debug.Write("xml found at pos: " + beginPos.ToString() + " - " + endPos.ToString());
                imgsrc = contents.Substring(beginPos + beginCapture.Length + 1,
                    (endPos - (beginPos + beginCapture.Length + 1)));
                Debug.Write("Xml len: " + imgsrc.Length.ToString() + " Imgsrc = " + imgsrc);
            }

            // Return the cursor to the start of the stream.
            srcs.Seek(0, SeekOrigin.Begin);
        }

        // Promote the extracted value to the message context using the property schema.
        pInMsg.Context.Promote("Source",
            @"http://MoustafaRefaat.ImagingPipeLine.PropertySchema.PropertySchema",
            imgsrc);
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.ToString());
    }

    return pInMsg;
}

If you have any questions, I would love to hear from you.