1. Condor, Globus and SRB: Tools for Constructing a Campus Grid. Jon Wakelin.

2. Overview. Condor; Globus; Storage Resource Broker (SRB); UoBGrid; Summary.

3. Condor Overview. High Throughput Computing environment built from networked resources (a Condor pool). Like other schedulers: queuing mechanism, prioritisation scheme, scheduling policy. Unlike other schedulers: doesn't need dedicated resources; can use desktop workstations and library or PC lab computers (cycle scavenging).

4. Class Ads. Classified advertisements: Machine Class Ads (for sale) and Job Class Ads (wanted). Machine Class Ads: created from information "advertised" by machines in the Condor pool; extra Class Ad information can be added. Job Class Ads: created from information in the Condor submit file; created from default values.

5. Different roles in a Condor pool. Central Manager; Submit; Execute; or a combination of these, e.g. a submit and execute node. Different daemons will be started depending on the role of the machine.

6. Condor Daemons. All machines: condor_master - controls other daemons. Central Manager: condor_collector - collects information from other machines; condor_negotiator - performs matchmaking. Execute: condor_startd - starts, stops, suspends jobs. Submit: condor_schedd - maintains queue of jobs.

7. Job Submission.
    Executable = /bin/ls
    Arguments  = -l
    InitialDir = /usr/bin
    Output     = out
    Error      = err
    Queue

8. Job Submission.
    Executable = /bin/ls
    Arguments  = -l
    InitialDir = /usr/bin
    Output     = out.$(Process)
    Error      = err.$(Process)
    Queue 2

9. Job Submission.
    Executable   = /bin/ls
    Arguments    = -l
    InitialDir   = /usr/bin
    Output       = out.$(Process)
    Error        = err.$(Process)
    Requirements = ((Arch == "INTEL" && OpSys == "LINUX") || (Arch == "INTEL" && OpSys == "IRIX65"))
    Queue 2

10. Job Submission.
    Executable   = gaussian.$$(Arch).$$(OpSys)
    InitialDir   = /home/jon
    Input        = chlorobenzene.in
    Output       = chlorobenzene.$(Process)
    Error        = chlorobenzene.$(Process)
    Requirements = ((Arch == "INTEL" && OpSys == "LINUX") || (Arch == "INTEL" && OpSys == "IRIX65"))
    Queue 2

11. Condor Commands. condor_submit <submit_file>; condor_q; condor_rm; condor_status - displays pool status in a succinct format; condor_status -l <machine> - displays full Class Ad information.

12. Condor-G. Condor interface to access Globus resources: condor submit file, condor commands. Keeps log of runs. Adds fault tolerance. Can be used to perform matchmaking: machine Class Ads must be created manually (condor_advertise command). Can be used to create a resource broker: there is no RB functionality in the Globus Toolkit.

13. Globus Toolkit Overview. Globus is a toolkit, not a turnkey solution. Globus Toolkit 2.4.3 is a common choice for production grids. Four main components: Authentication (GSI); Resource management (GRAM); Data transfer (GridFTP); Resource discovery and monitoring (MDS).

14. Authentication. Grid users need to obtain something called a certificate. Applications can use the certificate to establish the identity of the user, i.e. authenticate the user.

15. PKI Authentication. Public Key Infrastructure. Public/private keys: used to encrypt data and to sign certificates. Certification Authority (CA): user creates certificate; CA signs certificates; UK eScience CA at RAL. Certificate contains: identity/Distinguished Name (DN); public key; signature and identity of CA.

16. GSI Authentication. Grid Security Infrastructure: extensions to PKI (X509, SSL extensions). Single sign-on. Delegation. grid-proxy-init - command to create proxy certificates.

17. Resource Management (GRAM). Grid Resource Allocation Manager: Gatekeeper; Resource Specification Language; JobManagers.

18. GRAM - Gatekeeper. Daemon that runs on a grid resource. Processes incoming Globus requests. Authenticates users: configured to trust a given CA, e.g. UK eScience CA at RAL. Maps user to local account: DN => username (grid-mapfile). Passes the job on to the jobmanager.

19. GRAM - RSL. Resource Specification Language: (attribute op value) in parentheses. Operators: numerical operators within clauses (<, <=, >, >=, =, !=); logical operators between clauses (&, |). Attributes: predefined - executable, arguments, stdin, stdout, stderr, environment, maxCpuTime, maxWallTime, maxMemory, project, queue; user defined - may be handled by subsequent application.

20. GRAM - RSL.
    &(executable="/bin/ls")(arguments="-l")(directory="/usr/bin")

21. GRAM - JobManagers. Perl modules: convert RSL into scheduler-specific language. Reference implementations: Fork, Condor, PBS, LSF. May need to roll your own, e.g. for LoadLeveler or SGE, or just to add extra functionality.

22. Data Management (GridFTP). File Transfer Protocol. Extension of the standard FTP protocol stack to include extra functionality: GSI authentication; third-party transfers; striped transfers. User application is globus-url-copy.

23. Information Services (MDS). Collect and provide status information about Grid resources. MDS: Monitoring and Discovery Service. GRIS: Grid Resource Information Service - collects info about local resource; reports to GIIS server. GIIS: Grid Index Information Service - aggregates information from GRIS servers; one per organisation. Same executable with different configuration.

24. Storage Resource Broker (SRB) Overview. Uniform interface to heterogeneous data storage resources: Unix, Irix, Linux file systems; Windows; databases; physical media (tape storage). SRB is middleware: allows access to a wide range of data resources; allows a wide range of user apps to be written; all accessed through a "narrow" API. Diagram labels: Storage, Applications, API.

25. SRB Access. Applications: Scommands (command line); MySRB (web access); inQ (Windows GUI). APIs: Java, C, C++, Python, Perl.

26. UoBGrid Overview. What is a Campus Grid? Our Situation; Software Choices; Services.

27. What is a Campus Grid? A Grid: single sign-on to multiple resources located in different administrative domains.

28. Our Situation. Dedicated departmental clusters; Windows Condor pools not a requested resource. Separation of user communities: parallel vs serial usage. All contained within a single firewall domain. Wanted to become partners in the NGS: systems must be compatible; encourage our users to become NGS users. Full Economic Costing coming soon! Important to keep usage records; ensure best usage of purchased resources for a sustainable future.

29. Software Choices. Condor 6.6.7; Globus 2.4.3; MyProxy; GSI-SSH; Storage Resource Broker (SRB). Virtual Data Toolkit (VDT): bundles many useful tools; platform-independent installation; supported release of Globus Toolkit, MyProxy & GSI-SSH.

30. Planned Resources.

31. Current Resources. 4 servers: RB (Resource Broker); VOM (Virtual Organisation Manager); MDS (Monitoring and Discovery Service); SRB (Storage Resource Broker). 4 compute resources: Monster2 - SGE, 20 CPU; Tuya - PBS, 16 CPU; Grendel - PBS, 110 CPU; BSESrv1 - PBS, 28 CPU.

32. Resource Broker. Condor-G with matchmaking. Custom script for determination of resource status: converts MDS information into Condor Class Ads; adds information about available software. User submission script: creates Condor submit file; software requirements passed into Condor submit file; submits jobs; sends data to SRB. http://cerb-rb.bris.ac.uk/cgi-bin/rb_statu .cgi

33. Virtual Organisation Manager. Built using: webserver (Apache + mod_ssl); Perl CGI; Postgres database; modified Globus JobManagers. Functionality: record of users and machines; administrative functions; accounting/usage statistics.

34. Virtual Organisation Manager. Admin via web interface (https): access based on certificate/DN; add/remove users; add/remove resources; control users' access to resources; constructs grid-mapfiles for all resources. https://cerb-vom.bris.ac.uk/vom-bin/VOM.cg

35. Virtual Organisation Manager. Accounting/usage statistics: usage by machine; usage by users. Modified GRAM JobManagers: job details sent to DB on completion - executable, arguments, start time, end time, CPU, wall time, memory, virtual memory, jobmanager-type, number of nodes. http://cerb-vom.bris.ac.uk/cgi-bin/VOM-usa e-stats.cgi

36. Resource Monitor. Runs GIIS: collects information from UoBGrid resources. Runs Big Brother monitoring software: client/server model; server pings registered resources; client records local system info and reports to server.

37. Big Brother Monitoring System. Web-available status page with easy-to-understand functionality for helpdesk and admin staff.

38. Storage Resource Broker. All UoBGrid users given an SRB account. GSI authentication enabled for Scommands: access via certificate.

39. UoB Grid.

40. Compute resources running GRIS report to information servers.

41. Resource Broker polls information servers and converts MDS information into Condor Class Ads.

42. User logs on to Resource Broker to submit job. Jobs are matched to resources using Condor.

43. Job details sent to machine by Condor-G.

44. Upon completion, output files are sent back to the Resource Broker.

45. If job runs on UoB Grid resources, run details are sent to VOM DB, for NGS and UoB users alike.

46. Finally, output files are sent from RB to the Storage Resource Broker.

47. Summary. Condor: standalone high throughput computing system; matchmaking with Class Ads; Condor-G is the interface to Globus Toolkit. Globus Toolkit: applications, protocols, APIs; GSI - certificates (DN, public key, digital signature). UoBGrid: centralised access to disparate resources; custom components created to fill functionality gaps; Globus gives authenticated access to resources; Condor provides matchmaking (i.e. brokering); SRB provides storage.

48. Useful URLs. SRB: http://www.sdsc.edu/srb/ Globus: http://www.globus.org/ Condor: http://www.cs.wisc.edu/condor/ UK eScience CA: https://ca.grid-support.ac.uk/ NGS: http://www.ngs.ac.uk/ UoBGrid: http://escience.bris.ac.uk
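
Illustration (not part of the original slides). The Class Ads and Job Submission slides describe matchmaking only in outline, so the following is a minimal Python sketch of the idea: a job's Requirements expression is evaluated against each machine Class Ad, and `$$(Attr)` references in the executable name are filled in from the matched machine. The machine names and the helpers `requirements`, `match` and `resolve_executable` are hypothetical, and this is not Condor's actual ClassAd evaluator.

```python
import re

# Hypothetical machine Class Ads; in a real pool these are advertised by condor_startd.
MACHINE_ADS = [
    {"Name": "pc-lab-01", "Arch": "INTEL", "OpSys": "LINUX"},
    {"Name": "pc-lab-02", "Arch": "INTEL", "OpSys": "WINNT51"},
    {"Name": "sgi-01", "Arch": "SGI", "OpSys": "IRIX65"},
]


def requirements(machine):
    """Python rendering of the Requirements expression from the Job Submission slides."""
    return ((machine["Arch"] == "INTEL" and machine["OpSys"] == "LINUX")
            or (machine["Arch"] == "INTEL" and machine["OpSys"] == "IRIX65"))


def match(machine_ads, job_requirements):
    """Return the machine ads that satisfy the job's requirements."""
    return [ad for ad in machine_ads if job_requirements(ad)]


def resolve_executable(template, machine):
    """Replace each $$(Attr) with the value of Attr in the matched machine ad."""
    return re.sub(r"\$\$\((\w+)\)", lambda m: str(machine[m.group(1)]), template)


if __name__ == "__main__":
    for ad in match(MACHINE_ADS, requirements):
        print(ad["Name"], resolve_executable("gaussian.$$(Arch).$$(OpSys)", ad))
```

With these made-up ads only the INTEL/LINUX machine matches, so the job would run an executable named after that machine's architecture and operating system. A real negotiator also evaluates the machine's own requirements and ranks the candidates, which this sketch leaves out.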

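
Illustration (not part of the original slides). The Gatekeeper slide maps a certificate DN to a local username through a grid-mapfile, and the Virtual Organisation Manager slides say that these files are constructed for every resource. A rough Python sketch of both steps follows, assuming the usual one-entry-per-line layout of a quoted DN followed by a local account name. The DNs, account names and the helpers `build_gridmap` and `lookup` are hypothetical; the real VOM is described as Perl CGI backed by Postgres.

```python
import shlex

# Hypothetical VOM records: certificate DN, local account, resources the user may access.
USERS = [
    {"dn": "/C=UK/O=eScience/OU=Example/CN=alice example",
     "account": "alice", "resources": {"grendel", "tuya"}},
    {"dn": "/C=UK/O=eScience/OU=Example/CN=bob example",
     "account": "bob", "resources": {"tuya"}},
]


def build_gridmap(users, resource):
    """Build grid-mapfile text for one resource from the access records."""
    lines = ['"{}" {}'.format(u["dn"], u["account"])
             for u in users if resource in u["resources"]]
    return "\n".join(lines) + "\n"


def lookup(gridmap_text, dn):
    """Return the local account mapped to dn, or None if the DN is not listed."""
    for line in gridmap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # keeps the quoted DN, spaces included, as one field
        if len(fields) >= 2 and fields[0] == dn:
            return fields[1].split(",")[0]  # first account if several are listed
    return None


if __name__ == "__main__":
    gridmap = build_gridmap(USERS, "grendel")
    print(gridmap, end="")
    print(lookup(gridmap, USERS[0]["dn"]))
```

Generating one file per resource from a central table is what lets a single administrative change (adding or removing a user's access) propagate to every gatekeeper.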