
Wednesday, 25 November 2015

Networks that span multiple Engineered Systems/Exalogic Accounts

This blog post introduces some functionality that became available fairly recently (~Oct 2015) and allows additional InfiniBand shared networks to be defined.  This enables internal networks to span accounts or be extended to other Engineered Systems.

Historically an Exalogic rack is set up with two internal (IPoIB) networks whose IP addresses can be handed out to vServers in all accounts: the vServer Shared Storage and the IPoIB Default networks.  vServers on the storage network are limited members of that partition and full members of the InfiniBand default partition. It is possible to override the membership of a virtual machine to allow vServers to communicate with each other internally on the InfiniBand storage network.

Security concerns about using the IPoIB default network to allow inter-vServer communication alongside access to the database tier have meant that this network tends not to be used for cross-account conversations.   The only other mechanism for allowing network traffic between accounts was a public EoIB network, which has the downside of preventing the use of the high-performance InfiniBand protocols and mandating smaller MTU sizes, and is thus sub-optimal for performance-sensitive applications.

Recent changes in Exadata have introduced support for the use of non-default partitions.  Indeed, when Exadata is set up with the database running in a virtual machine, the normal configuration makes no use of the IPoIB default partition (0x7fff).   This was a problem for Exalogic, which historically only had access to Exadata over the IPoIB default network.

The standard configuration of a virtualised Exadata is to have two IB partitions: one that allows the database server to talk to the storage servers, and another that connects the virtual machine to the other virtual machines on the Exadata so that a distributed RAC cluster can be set up and use IB for inter-cluster communications.  Obviously, if Exalogic is to communicate with Exadata using the InfiniBand-optimised protocols, the Exalogic must be able to link in with the Exadata over a non-default InfiniBand partition.  This is depicted in figure 1 below.


Figure 1 - Connecting EL and ED using non-default Infiniband Network

This example shows a two-tier application deployed to Exalogic.  The web tier has access to the EoIB client network and potentially hosts an application like Oracle Traffic Director; it can forward requests on to an application tier over an internal private network.  The application tier is in turn linked to another internal IPoIB network, but one that might be considered a "public private network", meaning that it can be handed out to vServers and provides linkage to the Exadata virtual machines which have had this specific network (partition) allocated to them.  The Exadata also has two other internal IB networks: one to allow the RAC cluster to communicate between the DB servers and another to allow access to the storage cells.

The approach to creating a non-default network that spans both Exalogic and Exadata offers a couple of options: firstly, to extend a private network from an Exalogic account into the Exadata rack; secondly, to create a new Exalogic custom shared IPoIB network which can span multiple Exalogic accounts.

Extending a Private Network

In this scenario we create a private network within an Exalogic account and then expand the Infiniband partition into the Exadata.  This means that access to the Exadata is kept purely within an Exalogic account.  The steps to go through are:
  1. Create a private network in an Exalogic Account
  2. Edit the network to reserve the IP addresses in the subnet that the Exadata will use.
  3. Identify the pkey value that this new network has been assigned.
  4. Using the IB command line/Subnet Manager, extend the new partition to the Exadata switches and database servers (a sketch of this step follows this list).
  5. Recreate the Exadata virtual machines, adding the new partition key to the virtual machine configuration file used.
  6. Configure the Exadata VM to use one of the IP addresses reserved (made unavailable to Exalogic) in step 2.
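As an illustration of step 4, the partition is normally extended from the switch that hosts the master Subnet Manager using the smpartition command.  The sketch below is indicative only - the pkey (0x0a99) and the port GUIDs are hypothetical, and the exact flags should be checked against your switch firmware documentation and the Oracle support note:

# On the switch running the master Subnet Manager
smpartition start
# Add the Exadata DB node HCA port GUIDs to the partition as full members
smpartition add -pkey 0x0a99 -port 0x0021280001ef1234 -m full
smpartition add -pkey 0x0a99 -port 0x0021280001ef5678 -m full
smpartition commit
# Check the active partition table
smpartition list active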

Creating a new "Custom Shared IPoIB Network"

This is a slightly more flexible approach than the first scenario as we create a new "public private" network and then allocate IP addresses on this network to each account that will need access to it.  This is also useful in use cases where Exadata is not involved, because it allows certain virtual machines to be set up as service providers and others as service consumers, a provider being an IB full member of the partition and a consumer a limited member.  Thus all consumers can access and use the service provider functions, but the consumers cannot "see" each other.

This example is for the connected Exadata that we discussed earlier.  In this case the process to follow is:-

  1. Run the process to create the new IPoIB network.  It can be set up such that all vServers will be limited or full members by default; it defines the IB partition and specifies the subnet used, as well as which IP addresses the Exalogic rack will use.
  2. Allocate a number of IP addresses from this new network to each account that will use it.  This is the same process used today for EoIB networks, the storage network or the IPoIB Default network.
  3. Create vServers in the accounts with an IP address on the custom shared network.
  4. Identify the pkey for the custom network and extend the partition to the Exadata switches and DB server nodes.  The primary difference here is that if the Exadata was set up first then the first step in this process would have been to specify the pKey that was originally used by the Exadata.  (i.e. Either the Exadata or the Exalogic can be the first to specify the pKey.)
    1. Warning - The pKey being used is defined manually.  Make sure it will not overlap with any pKeys that Exalogic Control will assign.
  5. Recreate the database virtual machines assigning the pkey to their configuration and within the VM specify the IP address you want them to use.  
  6. Test connectivity end to end (a quick verification is sketched below).
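One quick way to verify from an Exalogic vServer that the custom partition is in place and that the Exadata VM is reachable is sketched below (the interface name ib2 and the IP address are hypothetical examples):

# Show the partition key assigned to the IPoIB interface on the custom network
cat /sys/class/net/ib2/pkey

# Check connectivity to the Exadata VM's address on the custom shared network
ping -c 3 192.168.40.10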
Note - The technical details on how to achieve this are fully documented in an Oracle support note.  Get in touch with your local Oracle representative to find out more.

Friday, 31 October 2014

Disaster Recovery of WLS Applications on Exalogic

Introduction

For many years Oracle Fusion Middleware based on WebLogic server has been capable of being used to provide high availability, fault tolerance and disaster recovery capabilities.  This has been documented as part of the Maximum Availability Architecture whitepapers. Follow this link for all the MAA documentation or follow this link to go directly to the Fusion Middleware Disaster Recovery architecture documentation.

Exalogic/Exadata provides an ideal platform on which these architectures can be realised, with all the advantages that come with using Oracle Engineered Systems.

This blog posting gives a very high level overview of the principles used in implementing active/passive DR for a Fusion Middleware application.  Much of the activity involved from an application perspective is identical irrespective of the deployment being on physical or virtual hardware.  In this article we will take a slightly deeper dive into how the Exalogic ZFS storage appliance is used to enable the DR solution.

Basic principles involved in setting up FMW DR

The basic tenet of deploying an application is to follow a set of rules during the deployment/configuration of the application which will make it simple to start the application up on the DR site.  The setup should be:
  1. Deploy all tiers of the application ensuring:-
    1. In the primary environment a set of hostname aliases is used for all configuration; these aliases are not linked to specific hosts, and all configuration in the products specifies these names rather than actual IP addresses.
    2. The binary files and application configuration (normally the domain homes) are all located as shares on the ZFS appliance and mounted via NFS to the Exalogic vServers (example fstab entries are sketched after this list).
    3. Critical application data that must be persisted goes into the Database, specifically the WebLogic transaction logs and the JMS messages.  (We will use the Oracle Data Guard product to ensure critical data is synchronously copied to the remote site.)
    4. Keep the configuration in the Operating System to the absolute minimum possible, probably no more than /etc/hosts entries and, if needed, specific service startup commands.  Other OS configuration should be built into the templates used to create the environment in the first place.
  2. Create mirror vServers on the DR site.
    1. These vServers will be used to host the production environment when DR has occurred.  The same minimal OS configuration should be present on this site.  To save time during DR the servers can be left running, or they can be started on demand when DR is invoked.  If already running then ensure that the application services are all shut down.  The hosts files must contain the same hostname aliases as the primary site, but obviously they will resolve to different IP addresses.
  3. Create a replication agreement for all the shares that host the application binaries and domains.
  4. When DR is to happen (ignoring the DB):
    1. Break the replication agreement
    2. Export the replicated shares so that they can be mounted.
    3. Mount the replicated shares in exactly the same location on the DR vServers
    4. Startup the application on the DR environment
    5. Test and if OK then redirect traffic at the front end into the DR service.
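As an illustration of step 1.2, the binary and domain home shares might be mounted on the vServers with /etc/fstab entries along the following lines (the share paths and mount points are hypothetical examples):

# NFS mounts for the shared product binaries and WebLogic domain homes
el01sn-priv:/export/appA/products   /u01/app/products   nfs   rw,bg,hard,nointr,tcp,vers=3   0 0
el01sn-priv:/export/appA/domains    /u01/app/domains    nfs   rw,bg,hard,nointr,tcp,vers=3   0 0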
Obviously this is somewhat simplified from most real world situations where you have to cope with managing other external resources, lifecycle management and patching etc.  However the approach is valid and can be worked into the operations run book and change management processes.

All these steps can be automated and put into the control of Enterprise Manager such that the element of human error can be removed from the equation during a disaster recovery activity.

Using the ZFS Storage Appliance for Replication

From the application perspective a key function lies with the NAS storage which has to be able to copy an application from one site to another.  The ZFS Storage appliance within an Exalogic is a fantastic product that provides exactly this functionality.  It is simple to set it up to copy the shares between sites.

Setup a Replication Network between sites

The first activity required when wishing to perform DR between two sites is to create a replication network between the ZFS appliances in both Exalogic racks.  This can be done using the existing 1GbE management network, however this is not recommended as that network is not fault tolerant, there being only one 1GbE switch in the rack.  On the ZFS appliance there are two 1/10GbE network connections available on the back of each storage head (NET2 & NET3).  By default one connection goes into the 1GbE switch and the other is a dangling cable, so two independent routes into the data centre are available.  If longer cables are required to make the connections then it is possible to disconnect the existing ones and put in new ones.  (Recommendation - get Oracle Field Engineers to do this; it is a tight squeeze getting into the ports and the engineers are experts at doing this!)

Once each head is connected via multiple routes to the datacenter, and hence on to the remote Exalogic rack, you can use link aggregation to combine the ports on each head and then assign an IP address that floats from head to head, so that it is always on the active head and hence has access to the data in the disk array.

Replicating the shares

Having set up the network such that the two storage appliances can access each other, we now go through the process of enabling replication. This is a simple case of setting up the replication service and then configuring replication on each project/share that you want copied over.  Initially set up the remote target where data will be copied to.  This is done via the BUI by selecting Configuration and then Remote Replication.  Click on the + symbol beside "Targets" to add the details (IP address and root password) of the remote ZFS appliance.

Adding a replication target
Once the target has been created we now setup the project/share to be replicated.  Generally speaking I would expect a project to be replicated which means that all the shares that are part of the project will be replicated, however it is possible to replicate at the share level only for a finer granularity.
To setup replication using the BUI simply click on the Shares and either pick a share or click on the Projects and edit the project level.  There is then a replication sub-tab and you can click on the "+" symbol to add a new "Action" to replicate. 

Replication of a project
Simply pick the target that you set up earlier in the remote replication agreement, pick the pool - which will always be exalogic - and define the frequency: Scheduled can be as frequent as every half hour, while Continuous means that a new replication cycle starts as soon as the previous one completes.  There are a couple of other options to consider: a bandwidth limit so that you can prevent replication swamping a network, "SSL encryption" if the network between the two sites is considered insecure, and "Include Snapshots" which will copy the snapshots over to the remote site.

Obviously the latter two options have an impact on the quantity of data copied, and performance is worse if all data has to travel encrypted.  However, after the initial replication only changed blocks will be copied across, and given that the shares are used primarily for binaries and configuration data there will not be a huge quantity flowing between the sites.
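For completeness, the same target and replication action can also be configured from the appliance CLI.  The following is a rough sketch only - the context and property names are from memory and may vary between appliance software releases, so treat the BUI steps above as the authoritative route:

el01sn01:> configuration services replication targets
el01sn01:... targets> target
el01sn01:... target (uncommitted)> set hostname=<remote appliance IP>
el01sn01:... target (uncommitted)> set root_password=<remote root password>
el01sn01:... target (uncommitted)> set label=dr-site
el01sn01:... target (uncommitted)> commit

el01sn01:> shares select application_a replication
el01sn01:... replication> action
el01sn01:... action (uncommitted)> set target=dr-site
el01sn01:... action (uncommitted)> set pool=exalogic
el01sn01:... action (uncommitted)> set continuous=true
el01sn01:... action (uncommitted)> commit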

Process to mount the replica copies

Having completed the previous steps we have the binaries and configuration all held at the primary site and a copy on the remote site.  (Although bear in mind that the remote copy may be slightly out of date!  It is NOT synchronous replication.)  For DR we now assume that the primary site has been hit by a gas explosion or, slightly less dramatically, that we are shutting down the primary site for maintenance and so want to move all services to the DR environment.  The first thing to do is to stop the replication from the primary site.  If the primary environment is still running then this is as simple as disabling the replication agreement.  Obviously if there is no access to the primary then one must assume that replication has stopped.



Then on the DR site we want to make the replicated shares available to the vServers.  This is achieved by "exporting" the project/share.  To navigate to the replica share simply select Shares and then the Projects listing or Shares listing as appropriate.  Under the "Projects" or "Filesystems : LUNs" title you can click to see the Local or Replica filesystems.  By default the local ones are shown, so click on Replica to see the data copied from a remote ZFS appliance.

Replicated Projects
We can then edit this project just as we would a local project.

Under the General tab there is the option to "Export", simply select this check box and hit apply and the share will be available to mount by the clients.  By default the same mount point that was on the primary site will be used on the DR site.

Health Warning : When you export a project/share then all shares with the same directory mount point are re-mounted on the client systems.  Make sure every project has a unique mount point.  If left at the default of /export then the Exalogic Control shares are also re-mounted which has the impact of rebooting compute nodes.  

Export checkbox to enable share to be mounted

Once the shares have been exported the DR vServers can mount them, start the application services up and be ready to take over from the primary site.  Finally, create a replication agreement to push data from the DR site back to the primary until failback to the primary site takes place.
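Pulling the DR-side steps together, a minimal sketch of what is run on each DR vServer once the project has been exported might look like the following (the appliance name, share paths and scripts are hypothetical and should match whatever the primary site uses):

# Mount the replicated shares at the same locations used on the primary site
mount el01dr-sn-priv:/export/appA/products /u01/app/products
mount el01dr-sn-priv:/export/appA/domains  /u01/app/domains

# Start the application services, for example the domain's administration server
nohup /u01/app/domains/appA_domain/bin/startWebLogic.sh &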

Once the environment has been correctly set up, all the DR steps take only on the order of seconds to complete, so the technical portion of the DR switchover outage can be brought down to seconds.


Monday, 10 March 2014

EM12c and Exalogic 2.0.6

Introduction

In an earlier posting I wrote about the process used to integrate a virtualised Exalogic rack with EM12c.  Since that article was written both EM12c and Exalogic have had an upgrade.  This posting is a short one to highlight the changes done/work needed to get the following combination of products working together:-
  • Exalogic - 2.0.6.n.n
  • EM12c - 12.1.0.3.0
In EM12c the plugins now used are:-
  • Oracle Virtualization - 12.1.0.5.0
  • Oracle Engineered System Healthchecks - 12.1.0.4.0
  • Sun ZFS Storage Appliances - 12.1.0.4.0

Agent Deploy

The steps to deploy the agent to EMOC and OVMM are the same as previously mentioned.  The changes from the previous instructions are:-
  • Now EMOC and OVMM are both on the same vServer so there is only one agent that needs to be deployed.
  • No need to create the /var/exalogic/info/em-context.info file.  However you do need to ensure the /var/log/cellos/SerialNumbers file is created and populated.  (See deployment notes)
  • The sudo permissions can be simplified down to the oracle user permissions of:-
    oracle ALL=(root) /usr/bin/id, /opt/oracle/em12c/agent/*/agentdeployroot.sh
Note 1 - I have done this agent deployment a number of times now and, because the labs I am working in often do not have DNS fully set up, I end up using the local hosts files for name resolution.  It is critical that both the OMS server and the agent deployment target server have the target hostname fully qualified in their hosts files.  Otherwise the later steps of deployment, when EM12c attempts to "Secure Agent", will fail to start up the agent.

Note 2 - The same process applies to deploying an agent to a guest vServer.  However, be warned: I did this on a vServer using the base 2.0.6.0.0 guest template that had a couple of other small applications on it.  The agent deployment uses a reasonable amount of disk space (~1GB) and the deployment can fail because of this.  The log on the OMS server was reporting an error "pty required false  with no inputs".  It turns out that this was because the first step in the "Remote Prerequisite Check Details" performs an unzip of the installation media, which was running out of disk space and hanging.  Killing the unzip process caused the step to fail and indicate that the likely cause was disk space.  To avoid this in the first place ensure there is adequate disk space on the vServer.

Note 3 - In order to ensure that a vServer is discovered correctly as part of the Exalogic rack it is necessary to ensure that the file /var/log/cellos/SerialNumbers is generated from the dmidecode command output.  The script shown below can be used to generate this. Simply cut and paste this into a file called generateSerialNo.sh, make it executable and run it on the vServer.


#!/bin/bash
# Generate /var/log/cellos/SerialNumbers from the dmidecode output so that
# the vServer is discovered correctly as part of the Exalogic rack.

# Extract the serial number reported by dmidecode
serialCode=`dmidecode | grep Serial | grep -v Not | cut -d ":" -f2 | cut -d " " -f2`

if [ -f /var/log/cellos/SerialNumbers ]; then
    echo "File /var/log/cellos/SerialNumbers already exists."
else
    mkdir -p /var/log/cellos
    echo "====START SERIAL NUMBERS====" > /var/log/cellos/SerialNumbers
    echo "==Motherboard, from dmidecode==" >> /var/log/cellos/SerialNumbers
    echo "--System serial--" >> /var/log/cellos/SerialNumbers
    echo "$serialCode" >> /var/log/cellos/SerialNumbers
    echo "--Chassis serial--" >> /var/log/cellos/SerialNumbers
    echo "$serialCode" >> /var/log/cellos/SerialNumbers
fi



Note 4 - The configuration of the agent will fail if it cannot resolve its own hostname to a valid IP address. i.e. Make sure that /etc/hosts has an entry in it that specifies the hostname of the vServer you are adding.  (When creating vServers this does not happen by default as the names put into the /etc/hosts file are <server name>-<Network IP - dash separated>.)
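For example, a minimal /etc/hosts entry on the vServer might look like the following (the hostname and address are hypothetical):

# grep appvm01 /etc/hosts
10.10.10.21   appvm01.mycompany.com   appvm01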

Setup ZFS Appliance target

The setup process for the ZFS appliance is the same as described previously apart from a minor change that is required to the ZFS configuration.  During the process of setting up the appliance as a target it is necessary to run the "Configuring for Enterprise Manager monitoring" workflow.  This creates the user that will be used by the agent to log onto the appliance and gather the stats.  However, the user that is created is defined as a kiosk user.  It is necessary to deselect this option for the user created (oracle_agent) because EM12c requires access to some other data from the console.  If this option is not deselected then EM12c discovers the appliance but raises an alert "Cannot monitor target. Incorrect Credentials".

Having corrected the kiosk user, in the EM12c interface select the ZFS target and under Monitoring --> All Metrics enable the metrics you are interested in.  (By default they are disabled.)

Conclusion

The general process for deployment is essentially unchanged, just a few minor variations on a theme. 


Tuesday, 18 February 2014

Some Exalogic ZFS Appliance security tips and tricks

Introduction

The ZFS appliance that is internal to Exalogic has been configured specifically for the rack; however, while it is "internal" there are still a number of configuration options that should be considered when setting up a machine for production usage.  This blog posting is not an exhaustive list of all the security settings that can be applied to a ZFS appliance, but it does pick off some configuration values that should be thought about whenever the appliance is being set up for use.

User Security

Once an Exalogic rack has been installed there will, by default, be a single root user of the ZFS array defined.  It is likely that other users will need to create and manage storage space for their specific accounts, and handing out root privileges to them is not recommended.

The permissions are determined via a three layered system.
  • Authorizations
    • Configuration items have CRUD-like (Create, Read, Update, Delete) actions that can be taken.
  • Roles
    • Each role defines a number of authorizations that can be performed by a user with that role
  • User
    • Defines either a local or remote directory based user that is allowed to authenticate to the ZFS appliance, the roles and hence authorizations will determine which activities the user is able to perform.
In most situations that I have come across the ZFS appliance is administered by the rack administrator so all system level configuration can be performed by one user.  However, there is often a need to be able to provide delegated administration to either an individual share or to all shares in a project.

Consider a scenario where the vDC is to be set up with an account that will host all vServers for Application A; the application may require some shares to host its binaries and configuration files.  The machine administrator can initially create a project, say called application_a, and then create the role for administering the project.  To do this click on Configuration --> Users and click on the + symbol beside the Roles to create a new role.
Create role to administer shares for a specific project
For the authorizations select the scope to be that of Projects and Shares, then choose the exalogic storage pool and the project that was created earlier.  In this scenario we select all authorizations for all shares so that the user can create multiple shares as needed, although all within the context of the project.  (Click on Add to add the selected authorisations and then save to create the role.)  It is possible to only allow specific actions on the project or to limit the administration to a single share.

Having created the role we now need to create a user and allocate the role to that user.

Creating a user with restricted permissions


In the example shown above we create a local user that will only have the role to administer the Application A project as limited by the selection of the roles associated with the user. 

Should that user then attempt to make a change to anything other than their project/share the system will respond with the following message.

Error reported when the authorisation has not been granted.



Project/Share Security

Having defined a user with limited access to the ZFS device, we now turn our attention to the configuration that provides a level of security to help prevent malicious attacks on an NFS mounted share.  Most of the configuration settings for a share can also be set at the project level; as such we will discuss these first, remembering that if necessary the inheritance can be overridden to give an individual share a unique configuration.

  • General
    • Space Usage
      • The quota can be used to prevent any shares in this project from exceeding a set size.  Handy to set to ensure that this project does not use all the available disk space on the device.
    • Mountpoint
      • Not strictly a security feature, but it is good practice to always ensure that the project has a unique mountpoint defined.  By default a share will append the share name onto the project's mountpoint to determine the location of the share's data in the ZFS appliance's directory structure.  A format that we use is to give all shares a mount point of /export/<project name>/<share name>
    • Read Only
      • Obviously not possible in many cases but certainly at the share level you may wish to have the share setup as Read/Write initially and then change it to be read only so that users cannot accidentally delete the data on it.  (For example a binaries only filesystem.) During upgrades it could be switched back to read/write for the duration of the patching.
    • Filesystems - LUNS
      • Not directly applicable for Exalogic today, but certification to use the iSCSI facility of the ZFS appliance is underway; at that point setting the user, group and permissions for the LUNs created will be required.
  • Protocols
    • NFS 
      • Share Mode
        • Set to None so that by default a client cannot mount the filesystem unless they have specifically been given permission as an exception
      • Disable setuid/setgid file creation
      • Prevent clients from mounting subdirectories
        • Obviously security related, but it will be up to the individual use case to determine appropriate usage.
      • NFS Exceptions
        • Having set the share mode to None the usage of NFS Exceptions to allow clients to mount the share is mandatory. There are three mechanisms available to restrict access to a particular host or set of hosts.  Restricting by Host with a fully qualified domain name, by DNS domain or by network. 
          In general I have found the restriction by network to be the most useful, but that is partly because DNS domains are often not used when setting up for short term tests.  When using the Network type specify the "entity" to be a network using CIDR notation.  So for example, I might want to restrict the share to only vServers in the range 172.17.1.1 through to 172.17.1.14, in which case the entity should be set to 172.17.1.0/28.  The netmask can be taken down to an individual IP address (/32) if only one vServer is allowed to mount the share.
          The access mode is set to read/write or read only as needed for the share usage.
          Root Access indicates whether the root user on a client machine has root access to files on the share; in general NFS terminology, denying this is known as root squash.
Example NFS setup

    • HTTP, FTP & SFTP
      • Leave with share mode of None unless there is a specific need to allow these protocols to access data held on the share.
  • Access
    • This is a tab that has specific information for a share (other than the ACL Behaviour) so should be set independently for each share.  The Root Directory Access specifies the user/group and the file permissions that will be applied to the share when mounted on the client machine.  If using NFSv4, and hence some sort of shared user repository, then the user and group are validated against this store; otherwise you can use values such as nobody:nobody to specify the user:group, or enter the UID/GID of the users.  These IDs must map onto a user:group ID on the client machine.   The directory permissions are set according to the needs of the application.
    • ACL
      • Very fine grained access to files and directories is managed via Access Control Lists (ACLs), which describe the permissions granted to specific users or groups.  More detail is available from Wikipedia or in the NFSv4 specification (page 50) that is supported by the ZFS appliance.  In general I have found the default settings have been enough for my needs, where the world can read the ACLs but only the owner has permission to change/delete them.

Administration Security

The ZFS appliance has many configuration settings; however, to lock down the appliance it is possible to turn off a number of the services or re-configure them from the defaults to minimise the risk of intrusion.
  • Data Services
    • NFS
    • iSCSI - If not used then disable the service.  (As of Exalogic 2.0.6.1.0 iSCSI is only supported for the Solaris Operating System.  In future releases it will also be supported for Linux/virtualised racks.)
    • SMB, FTP, HTTP, NDMP, SFTP, TFTP can all be disabled unless specifically needed for some function.  (For example, I quite often use the HTTP service to allow easy access to some media files or to host a yum server.)
  • Directory Services
    • Generally use either NIS, LDAP or Active Directory for a shared identity store.  Turn off the services you are not using.
  • System Settings
    • Most of the system settings are useful to have enabled on the rack.  The default settings of having Phone home and Syslog disabled are the best bet.
  • Remote Access
    •  SSH is almost certain to be required to administer the device via the CLI and using scripted configurations.  However, if you set up another user with all the necessary permissions then it is possible to deselect the "Permit root login" option.  This means that it will no longer be possible to use the root account to ssh onto the rack.  NOTE - If using exaBR, exaPatch, exachk etc. then these rely on ssh access as root, so the flag would need to be toggled back prior to running these tools.
 By default the appliance can be administered on all networks.  This can be tightened up so that administration can only occur over the specific management networks.  To disable administration on a particular interface select the Configuration --> Network --> Configuration tab and then highlight the Interface that you want to disable and click the edit icon to change the properties and deselect the Allow Administration option.

Preventing administration on a particular interface
It is possible to prevent administration on all the networks, but the recommendation is to simply prevent it from the networks that a guest vServer can join, namely the IPoIB-vserver-shared-storage and the IPoIB-default.  These interfaces can be identified by the IP addresses or partition keys in the description shown in the browser interface.  The IPoIB-default network belongs to "via pffff_ibp1, pffff_ibp0", and the storage network will normally have an IP address in the 172.17.n.n range and be on partition 8005 (via p8005_ibp1, p8005_ibp0).  The partition for the shared storage may vary as it is configurable as part of the Exalogic Configuration Utility on the initial installation.

The effect of deselecting "Allow Administration" on the interface means that a browser will see an "Unable to connect" error and if the ssh interface is used then the following message is shown.

# ssh donald@172.17.0.9
Password:
Last login: Tue Feb 18 11:51:00 2014 from 138.3.48.238
You cannot administer the appliance via this IP address.
Connection to 172.17.0.9 closed.

Summary

In conclusion, there are actually relatively few actions to be taken from the default settings of an Exalogic ZFS appliance but the following should always be considered:-
  1. Setup users to administer the projects and shares that are limited to only have write access to the shares they need.
  2. For each share make certain that only the protocols that are needed are allowed access (normally NFS only, and potentially iSCSI in the future) and ensure that only specific hosts are allowed to mount the shares
  3. Prevent administration on the networks that are connected to guest vServers.


Friday, 31 January 2014

Running DNS (bind) for a private DNS domain in Exalogic

In an earlier post I described a process to setup bind to provide a relay DNS service that can be accessed from internal vServers and the shared storage.  This provides an HA DNS service to the shared storage in particular as without such a setup it will be relying on the non-HA 1GbE network for access to DNS.

The next obvious step in the process is to extend your bind configuration so that a local DNS service can be used for the vServers you create.  This would give name resolution for guests that you do not want included in the external DNS service.

The first step is to set up bind, or the named daemon, as described in my earlier blog entry.  Ensure that the vServer you are using for the DNS service is connected to an EoIB network and the shared storage network; this means that it is attached to three networks in total:
  1. the EoIB network which will give it access to the main DNS service in the datacenter, 
  2. the vServer-shared-storage which will allow the ZFS appliance to use this as a DNS server 
  3. the IPoIB-virt-admin network.  This is a network that is connected to all vServers, so if we make the DNS vServer a full member of this network (as described earlier in a post about setting up LDAP on the rack) then all vServers created can utilise the DNS services.  All we need to do is configure the network to use the domain service.

Once bind is operational we can extend the named configuration to include details for a domain internal to the Exalogic rack.  So in this example our datacenter DNS runs the domain mycompany.com; for lookups internal to the Exalogic we want to use the domain el01.mycompany.com, where el01 represents the Exalogic rack name.   The first step is to edit the main configuration file and add another section to specify that the bind service will be the master for the el01.mycompany.com domain.



# cat /etc/named.conf
options {
    directory "/var/named";

    # Hide version string for security
    version "not currently available";

    # Listen to the loopback device and internal networks only
    listen-on { 127.0.0.1; 172.16.0.14; 172.17.0.41; };
    listen-on-v6 { ::1; };

    # Do not query from the specified source port range
    # (Adjust depending your firewall configuration)
    avoid-v4-udp-ports { range 1 32767; };
    avoid-v6-udp-ports { range 1 32767; };

    # Forward all DNS queries to your DNS Servers
    forwarders { 10.5.5.4; 10.5.5.5; };
    forward only;

    # Expire negative answer ASAP.
    # i.e. Do not cache DNS query failure.
    max-ncache-ttl 3; # 3 seconds

    # Disable non-relevant operations
    allow-transfer { none; };
    allow-update-forwarding { none; };
    allow-notify { none; };
};

zone "el01.mycompany.com" in{
        type master;
        file "el01";
        allow-update{none;};
};
 

The extra section specifies that we will have a zone or DNS domain el01.mycompany.com. Within this zone this DNS server will be the master or authoritative source for all name resolution.  There is a file called el01 which will be the source of all the IP addresses that are served by this server.  Earlier in the configuration is the line

    directory "/var/named";

This specifies the directory that the named daemon will search in for the file called el01. The content of the file is as shown below.


# cat el01
; zone file for el01.mycompany.com
$TTL 2d    ; 172800 secs default TTL for zone
$ORIGIN el01.mycompany.com.
@             IN      SOA   proxy.el01.mycompany.com. hostmaster.el01.mycompany.com. (
                        2003080800 ; se = serial number
                        12h        ; ref = refresh
                        15m        ; ret = update retry
                        3w         ; ex = expiry
                        3h         ; min = minimum
                        )
              IN      NS      proxy.el01.mycompany.com.
              MX      10      proxy.el01.mycompany.com.

; Server names for resolution in the el01.mycompany.com domain
el01sn-priv   IN      A         172.17.0.9
proxy         IN      A         172.16.0.12
ldap-proxy    IN      CNAME     proxy
 

The properties or directives in the zone file are:-

  1. TTL - Time to live.  If there are downstream name servers then this directive lets them know how long their cache can be valid for.
  2. ORIGIN - Defines the domain name that will be appended to any unqualified lookups.
  3. SOA - Start of Authority details
    1. The @ symbol places the domain name specified in the ORIGIN as the namespace being defined by this SOA record.
    2. The SOA directive is followed by the primary DNS server for the namespace and the e-mail address for the domain.  (Not used in our case but it needs to be present)
    3. The serial number is incremented each time the zone file is updated.  This allows the named daemon to recognise that it needs to reload the content.
    4. The other values indicate time periods to wait for updates or to force refresh slave servers.
  4. NS - Name server - Gives the fully qualified names of the servers that are authoritative for this domain.
  5. MX - Mail eXchange, defines the mail server where mail sent to this domain is to be sent.
  6. A - Address record is used to specify the IP address for a particular name
  7. CNAME - The Canonical Name record, which can be used to create aliases for a particular server.
In the example above we have added a few addresses into the DNS domain,
  1. The storage head under the name el01sn-priv.  This means that all vServers will automatically be able to resolve by name the storage for use with NFS mounts.
  2. proxy (or ldap-proxy) is the name that we are using for a server where OTD is installed and configured to be a proxy for an external directory.  Thus enabling all vServers to access LDAP for authentication.  (Useful for NFSv4 mounts from the shared storage.)
So once this is all up and running, restart the named service and ensure that your DNS settings in the virt-admin network (in our case) include the search domain el01.mycompany.com and the IP address of the DNS vServer, as shown below.  This way every vServer created will be able to use the DNS service.
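On a newly created vServer the resulting resolver configuration and a quick lookup might then look something like this (the addresses match the examples above):

# cat /etc/resolv.conf
search el01.mycompany.com mycompany.com
nameserver 172.16.0.14

# nslookup el01sn-priv.el01.mycompany.com
Server:         172.16.0.14
Address:        172.16.0.14#53

Name:   el01sn-priv.el01.mycompany.com
Address: 172.17.0.9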



Thursday, 29 August 2013

Limiting OTD to listen only on the VIP address


In most production deployments OTD is likely to be deployed in a highly available configuration with two instances working as an active/hot-standby load balancing pair.  (See my earlier posting on running OTD HA.)  In production the environment will almost certainly have a number of security constraints put on it, one of which will be to keep the number of listening ports to an absolute minimum.  In the case of OTD this means that it should only listen for incoming requests on the Virtual IP address; by default the listener will listen on all interfaces for the given port.

Thus we want to setup the configuration to listen on just the VIP, as shown below.



Where the IP Address is the IP address for the VIP.

Having done this an attempt to start up the server instance fails with the error messages shown below.


./startserv
Oracle Traffic Director 11.1.1.7.0 B01/14/2013 04:13
[ERROR:32] startup failure: could not bind to <Virtual IP>:8080 (Cannot assign requested address)
[ERROR:32] [OTD-10380] http-listener-1: http://<Virtual IP>:8080: Error creating socket (Address not available)
[ERROR:32] [OTD-10376] 1 listen sockets could not be created
[ERROR:32] server initialization failed

Alternatively if you attempt to start the instance via the GUI then an error message similar to that shown below will appear.



The reason for this failure is that the VIP is only ever active on one node at a time.  When the instance attempts to start up, if the vServer it is running on has not yet brought up the VIP, or the VIP is currently assigned to the other vServer in the HA group, then it is impossible to bind to that interface.

Linux has the ability to allow binds to non-local IP addresses using the kernel parameter net.ipv4.ip_nonlocal_bind.  Setting this variable to 1 allows the OTD instance to start up even though the IP address is not currently local to the running process.  To set this up simply edit the /etc/sysctl.conf file and add the parameter with a value of 1.




# tail /etc/sysctl.conf

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 250000
vm.min_free_kbytes = 524288

# Additional entry to allow non-local binds so that we only listen to the VIP.

net.ipv4.ip_nonlocal_bind=1 
#
# sysctl -p
#

Once set, issue the sysctl -p command to refresh the configuration and we can start up the OTD instance.

You can check the currently running value in the /proc system files.

# cat /proc/sys/net/ipv4/ip_nonlocal_bind
1

#
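Incidentally, the same setting can be applied immediately at runtime, without waiting to edit /etc/sysctl.conf, although it must still be added to that file to survive a reboot:

# sysctl -w net.ipv4.ip_nonlocal_bind=1
net.ipv4.ip_nonlocal_bind = 1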



Tuesday, 18 June 2013

Integrating Enterprise Manager 12c with Exalogic

Overview

This posting provides an overview of how Enterprise Manager 12c has been integrated with Exalogic.  It will then dive into the installation process, providing an overview of the activities that complement the documentation.  (For integrating with Exalogic 2.0.6.0.n read this posting in conjunction with the details shown here.)

The versions used in this example are:-
  • Virtualised Exalogic - 2.0.4.0.2
  • Enterprise Manager (EM12c) - 12.1.0.2.0
Oracle Enterprise Manager is a powerful tool that can be used to manage a large enterprise compute facility, looking after both the hardware and the software running on it (apps-to-disk management).  The normal operating model is a central management service, the Oracle Management Service (OMS), which consists of an application hosted in WebLogic Server with an underlying database to hold configuration and state information.  It communicates with the systems it manages via an agent that is deployed onto each operating system.  Through the use of plugins and extensions it has specific knowledge of each environment and hence can present appropriate monitoring and management screens.

For Exalogic there are several plugins that enable it to create a powerful view onto the rack and monitor it from apps to disk.  These plugins include:
  1. ZFS Storage - A specific plugin allows EM12c to communicate with the ZFS appliance within the Exalogic rack to monitor the status of the storage.
  2. Virtualisation - A plugin allows communication with the Oracle Virtual Machine Manager system used in Exalogic to provide details of how the virtual infrastructure is deployed and a view onto each virtual machine (vServer) created.
  3. Exalogic Elastic Cloud/Fusion Middleware - This plugin links in with the Exalogic Control infrastructure and gives information on the state of the physical environment.  It also links into agents deployed onto the vServers and provides a central view on the middleware software that can be deployed onto Exalogic.  (Built-in understanding of WebLogic domains, applications deployed, Oracle Traffic Director installations and Coherence clusters.)
  4. Engineered Systems Healthchecks - A plugin that integrates with the exachk scripts to highlight any configuration inconsistencies.
The diagram below depicts a deployment topology for EM12c to monitor Exalogic.   There are more complex options available to make EM12c highly available and to manage firewalls and proxying of communications.  This blog posting is only really considering a basic installation for managing Exalogic.

OMS Deployment to monitor and manage an Exalogic rack
There are plenty of alternative network configurations and deployment options that could be considered; the key thing is that the OMS server should have a network path to both the Exalogic Control vServers (OVMM & EMOC) and to the customer-created vServers that will be running the applications.

For example, in a purely test setup that we have in the lab we actually run the OMS and OMS repository in a vServer on the Exalogic rack and make use of the IPoIB-virt-admin network to give the OMS server suitable access to all the vServers on the rack.  This is great for test and demonstration purposes, but in a large enterprise it is likely that the Enterprise Manager configuration will sit externally to the Exalogic.

This posting assumes that you already have an instance of Enterprise Manager 12c operational in your environment.  Details on the installation process can be found in the documentation.   This posting will continue to consider all the steps involved in configuring EM12c to monitor the Exalogic rack.

The installation documentation can be found here :-

EM12c Exalogic Configuration

As an overview the process is:-
  1. Get the correct versions of the software (plugins & EM12c) installed
  2. Deploy agents onto the OVMM & EMOC vservers in the Exalogic Control stack
  3. Deploy the ZFS Storage appliance plugin to monitor the storage
  4. Deploy the Exalogic Elastic Cloud plugin to get the Exalogic monitored.
  5. Deploy the Oracle Virtualization plugin to monitor the OVMM environment
  6. If adding the vServers themselves as monitored hosts, set up the vServers as needed
  7. Optional - Deploy the Engineered System Healthchecks

Prerequisites

The process of installing/configuring the various components to allow the Exalogic to be monitored in EM12c involves a number of prerequisite activities.

Ensuring you get the correct plugins

EM12c makes heavy use of plugins. Plugins are managed from the Extensibility menus. (Setup --> Extensibility --> "Self Update" or "Plug-ins")
If you have set up your EM12c instance in a network location that has access to the internet then you can automatically pick up the Oracle plugins from a well known location. Simply click where it says "Online" or "Offline" beside the Connection Mode under the Status in the Self Update page. If you are not able to access the internet then use the Offline mode; the tab shows the location for the em_catalog.zip, so download this, move it to the OMS server and then Browse/Upload the file or make use of the command line (# emcli import_update_catalog -file <path to zip> -omslocal).
Once uploaded, on the "Plug-ins" page ensure that the following plugins are downloaded and "On Management Server":
  • Oracle Virtualisation (12.1.0.3.0)
    • Note - This is not the most recent version, as there is an incompatibility between 12.1.0.4.0 and the OVMM instance that runs as part of Exalogic Control. If you have 12.1.0.4.0 already deployed then undeploy it from the OMS instance.
  • Exalogic Elastic Cloud Infrastructure (12.1.0.1.0) - Not required for Virtual monitoring as the fusion middleware monitoring incorporates Exalogic.  Necessary for monitoring of a physical Exalogic rack.
  • Oracle Engineered System Healthchecks (12.1.0.3.0)
    • Not necessary for the general system monitoring but allows visibility and control over running exachk, the health checking tool for Exalogic.
  • Sun ZFS Storage Appliance (12.1.0.2.0)

 

Deploying Agents to EMOC & OVMM

For full integration with EM12c it is necessary to have agents deployed to both the OVMM and EMOC vServers. The agent binaries have already been deployed to the control vServers but as EM12c does all the deployment itself it is actually simpler to use the facilities of em12c to deploy into a new directory. As such the following instructions will deploy the agents onto the rack:-
  1. Ensure you have an oracle user and known password on the vServers. (The oracle user is already present; as root, use passwd to change the password to a known value.)
  2. Create a directory to host the agent. eg.
    # mkdir -p /opt/oracle/em12c/agent
  3. Make the directory for the agent owned by the oracle user. (Check the group ownership on each vServer, on the OVMM the oracle user is in the dba group while on EMOC it is in the oracle group.)
    # chown -R oracle:oracle /opt/oracle
  4. If the vServers are not setup for DNS then ensure that the fully qualified hostname for the OMS server is included in the /etc/hosts file.
  5. Add the Exalogic info file to the template.
    On the hypervisor (OVS) nodes of the Exalogic rack is an identifier file that specifies the rack identifier.  The file is /var/exalogic/info/em-context.info. In the template create an equivalent directory structure and copy the em-context.info file into this directory.
  6. Make a symbolic link from the sshd file in /etc/pam.d to a file called emagent.  (This allows actions to be performed on the vServer using credentials managed in LDAP - see MOS note How to Configure the Enterprise Management Agent Host Credentials for PAM and LDAP (Doc ID 422073.1) for more detail.)
    # cd /etc/pam.d
  7. # ln -s sshd emagent

  8. Make the necessary changes to the sudoers configuration file (/etc/sudoers)
    1. Change Defaults !visiblepw to Defaults visiblepw
    2. Change Defaults requiretty to Defaults !requiretty
    3. Add the sudo permissions for the oracle user as shown below
      oracle ALL=(root) /usr/bin/id,/*/ADATMP_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[AP]M/agentdeployroot.sh, /*/*/ADATMP_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[AP]M/agentdeployroot.sh,/*/*/*/ADATMP_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[AP]M/agentdeployroot.sh,/*/*/*/*/ADATMP_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[AP]M/agentdeployroot.sh
    4. Now deploy the agent onto the vServer from the Enterprise Manager console.
      Setup --> Add Targets --> Add Target Manually & select the Add Host.... option and follow the wizard.
     

Setup to monitor the ZFS appliance

The ZFS appliance is monitored via an agent deployed to a device that has network access to the appliance, on an Exalogic the recommendation is to use the EM12c agent deployed to the Exalogic Control EMOC vServer.

Before setting up the monitoring in EM12c we have to run through a "workflow" on the ZFS storage appliance itself that will setup a user with appropriate permissions to monitor the appliance.   The agent hosting the ZFS Storage plugin will then communicate with the ZFS appliance as this user to gather details on the current operation.

To achieve this log onto the ZFS BUI as the root user and navigate to "Maintenance" --> "Workflows".  Then run the workflow called "Configuring for Oracle Enterprise Manager" which will create the user and appropriate worksheet to allow the monitoring of the device.

Enabling the ZFS Storage for EM12c monitoring

This activity to create the user must be repeated on the second/standby storage head although it is not necessary to recreate the worksheet on the second head.

Once complete, on the plugin management page (Setup --> Extensibility --> Plug-ins) deploy the ZFS Storage Appliance plugin to the OMS instance and then to the EMOC agent. You can then configure the ZFS target; this is done via the Setup menu.
  • Setup --> Add Target --> Add Target Manually
  • Select "Add Non-Host Targets by Specifying Target Monitoring Properties"
    • Select the Target Type of Sun ZFS Storage Appliance & select the EMOC monitoring agent.
    • In the wizard give the name you would like the appliance to appear as in the EM12c interface and supply the credentials and IP address for the device. (Use the IB storage network, not the public 1GbE network IP.)
Once this completes it is possible to select the target for the storage appliance and view details on the shares created and the current usage of the device.

Monitoring of ZFS Storage Appliance

Deploying the Exalogic Infrastructure Plugin

There are a couple of steps to getting the environment setup for the Exalogic infrastructure plugin to operate correctly.
  1. Sort out the certificates so that the agent can communicate with the Ops Centre infrastructure of Exalogic Control
  2. Deploy/configure the plugin.

Managing the EMOC certificates

The first step is to ensure that the EM12c agent can communicate with the Ops Centre instance which is only available over a secure communications protocol. Because it uses a self-signed certificate it is necessary to include this certificate in the trust store of the agent.
  1. Export the certificate from the Ops Centre keystore. This is the keystore that is in the OEM installation on the ec1-vm vServers. (/etc/opt/sun/cacao2/instances/oem-ec/security/jsse) It is possible to use the JDK tools to extract the certificate.

    # cd /etc/opt/sun/cacao2/instances/oem-ec/security/jsse
    # /opt/oracle/em12c/agent/core/12.1.0.2.0/jdk/bin/keytool -export -alias cacao_agent -file oc.crt -keystore truststore -storepass trustpass

    Note 1 - The default password for the EMOC truststore is "trustpass".  Others have mentioned that the password was "welcome".  If trustpass does not work try out welcome.
    Note 2 - We explicitly use the keytool version that is shipped with the Oracle EM12c Agent (Java 1.6). The default version of java on the Exalogic Control vServer is java 1.4 and running the 1.4 version of keytool against the truststore will result in the following error:-

    # keytool -list -keystore truststore
    Enter key store password: trustpass
    keytool error: gnu.javax.javax.crypto.keyring.MalformedKeyringException: incorrect magic
  2. Import the certificate you just exported into the agent's trust store. Ensure you import into the correct AgentTrust.jks file, specifically the one for the agent instance you are using and not (as the docs currently state) the copy in the agent binaries.  A quick keytool -list afterwards confirms the import, as sketched after these commands.
    # cd /opt/oracle/em12c/agent/agent_inst/sysman/config/montrust
    # /opt/oracle//em12c/agent/core/12.1.0.2.0/jdk/bin/keytool -import -keystore ./AgentTrust.jks -alias wlscertgencab -file /etc/opt/sun/cacao2/instances/oem-ec/security/jsse/oc.crt
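To confirm the import worked, list the agent trust store with the same keytool binary and check that the newly imported alias appears (you will be prompted for the keystore password):

    # /opt/oracle/em12c/agent/core/12.1.0.2.0/jdk/bin/keytool -list -keystore ./AgentTrust.jks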

Deploying the Exa Infrastructure Plugin

There are a number of steps to getting the Exalogic Infrastructure plugin to monitor the rack.
  1. Deploy the Exalogic Elastic Cloud Infrastructure to the OMS server. (Setup --> Extensibility --> Plug-ins, select the Exalogic Elastic Cloud Infrastructure and from the actions pick "Deploy on >" & "Management Servers" )
  2. Deploy the plugin to the Ops Center (EMOC) vServer. Once the plugin has been deployed successfully to the OMS instance then the same options as above but select to "Deploy on >" & "Management Agent..." and select the EMOC host agent.
  3. Now we want to run the Exalogic wizard to add the targets for the Exalogic rack itself. This is done via the Setup --> Add Target --> Add Targets Manually options. Then select "Add Non-Host Targets Using Guided Process (Also Adds Related Targets)", pick the Exalogic Elastic Cloud and click on "Add Using Guided Discovery" which will show the wizard as pictured below.

Discovery of Exalogic Elastic Cloud


This wizard appears to finish quickly and it is then possible to select the Exalogic from the Targets menu; however, the system will be initialising and synchronising in the background, so it takes a few minutes to get the full rack discovered. Once present, the screenshots below show the monitoring of the hardware, with a general picture for the rack and a couple of shots to show the Infiniband Network monitoring.


Exalogic Monitoring - Hardware view


Monitoring the Infiniband Fabric



Monitoring an Infiniband Switch

Deploying the OVMM Monitoring

The first thing to ensure is that the plugin version that is installed is the 12.1.0.3.0 version of Oracle Virtualization. The steps are similar to the steps for the Exalogic Infrastructure Plugin.

However, prior to doing the deployment to Enterprise Manager, the OVMM server should be set up to be read-only for the EM12c monitoring agent to use.  Follow these steps on the OVMM server to set up a user as read-only.

Log in to the Oracle VM Manager vServer as the oracle user, and then perform the commands in the sequence below.
  1. cd /u01/app/oracle/ovm-manager-3/ovm_shell
  2. sh ovm_shell.sh --url=tcp://localhost:54321 --username=admin --password=<ovmm admin user password>
  3. ovm = OvmClient.getOvmManager ()
  4. f = ovm.getFoundryContext ()
  5. j = ovm.createJob ( 'Setting EXALOGIC_ID' );
    The EXALOGIC_ID can be found in the em-context.info on dom0 located in the following file path location:
    /var/exalogic/info/em-context.info
    You must log in to dom0 as the root user to obtain this file. For example, if the em-context.info file content is ExalogicID=Oracle Exalogic X2-2 AK00018758, then the EXALOGIC_ID will be AK00018758.  (A one-liner to extract it is sketched after this list.)
  6. j.begin ();
  7. f.setAsset ( "EXALOGIC_ID", "<Exalogic ID for the Rack>");
  8. j.commit ();
  9. Ctrl/d
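As a convenience, the ID can be pulled straight out of the file on dom0 with a one-liner such as the following (assuming the file format shown in step 5):

    # grep ExalogicID /var/exalogic/info/em-context.info | awk '{print $NF}'
    AK00018758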

Now deploy the OVMM virtualisation plugins to the OMS server:-

  1. Deploy the Oracle Virtualization plugin to the OMS server
  2. Deploy the Oracle Virtualization plugin to the agent running on the OVMM server.
  3. Run the add target wizard for the Oracle VM Manager.
    1. Setup --> Add Target --> Add Target Manually
    2. Select the "Add Non-Host Targets by Specifying Target Monitoring Properties" & Chose the target type of "Oracle VM Manager" and the monitoring agent for the OVMM server host.
    3. Enter the details on the wizard page (example shown below)
    4. Submit the job, wait a few minutes to allow the discovery to progress and then you can view the Target under Systems or all targets.

    Running the discovery wizard for the Exalogic Virtualised Infrastructure

Deploying the Engineered System Healthchecks

Both Exalogic and Exadata have a healthcheck script that can be run - exachk. On Exalogic the script can be downloaded from My Oracle Support, and when run against an Exalogic rack it will check the configuration of the rack. Running exachk will create output files that detail any issues found with the rack.  To integrate with Enterprise Manager it is necessary to change the behaviour of exachk so that it outputs files in an XML format that can be parsed by the EM12c plugin and presented to the OMS server in a format that it can understand and present on screen.  To modify the behaviour simply set an environment variable prior to running the exachk script - export RAT_COPY_EM_XML_FILES=1. You can also use RAT_OUTPUT=<output directory> to direct the output to a specific location. (The default behaviour is to put the output into the same directory as the exachk script is run from.)
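A minimal run might therefore look like the following (the directory locations are hypothetical examples):

export RAT_COPY_EM_XML_FILES=1
export RAT_OUTPUT=/opt/exachk/output
cd /opt/exachk
./exachk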

The recommendation for a virtual Exalogic is to run the exachk utility on the EMOC vServer.

To install the plugin simply ensure that the "Oracle Engineered System Healthchecks" plugin is downloaded and installed onto the OMS server and to the agent deployed to the EMOC server.  Then create the target as per the OVMM mechanism. The wizard for the healthcheck simply requests the directory on the server where the output will be read from and the frequency of checking for new versions of the exachk output. (Default is 31 days.)  Then setup the EMOC server to run the exachk on a regular basis.  The output becomes available via the EM12c console and hence can be made available to specific users who may not actually have access to the rack itself.