





Abstract
NDMJOB is an NDMP-compatible backup/restore package, reference implementation, and conformance
test. It is furnished in source code form to the public by Traakan,
Inc. and other contributors free of charge or restriction. NDMP
(Network Data Management Protocol) is an open protocol for network
based backup enabling multi-vendor backup solutions. NDMJOB may
be used as a ready-to-run backup package, used in conjunction
with other NDMP products, as the basis for new products, as a
verification of NDMP products, and as a reference of the NDMP
protocol standards and conventions.
This NDMJOB/NDMJOBLIB
Technical Manual explains the software and discusses NDMP
in general. This software is more library than command (the command
interface is about 5% of the source code), and readily adapted
to other NDMP products. The audiences for this manual include:
- developers interested
in this software or NDMP in general;
- developers of other
NDMP products using NDMJOB as a testing and diagnostic tool,
and thus needing technical insight into NDMJOB;
- developers using
NDMJOBLIB as a foundation for other applications;
- contributors to
NDMJOB who enhance and extend its capabilities; and
- retargetters who
implement NDMJOB on additional operating systems.
The topics addressed include
- the elements, interactions,
and detailed operations of the NDMP architecture;
- the construction
of NDMJOB/NDMJOBLIB software;
- how NDMJOBLIB addresses
vagaries in the NDMP specification; and
- guidance for porting
(retargeting) NDMJOB to other operating systems.
| http://www.traakan.com/ndmjob | NDMJOB latest source and docs |
| http://www.ndmp.org | NDMP protocol status and info |
| ndmp-tech@ndmp.org | Questions, comments and bug reports |
Table of Contents
Introduction
What are NDMJOB and NDMJOBLIB
Adaptability to Applications
Background, Purpose, and Motivation
Scope and Intended Audience
Terminology - "Agents" vs "Client/Server"
Other Documentation
NDMP Architectural
Overview
NDMP Agents and Elements
CONTROL Agent (Client)
Disk Files
DATA Agent (Server)
Image Stream
TAPE Agent (Server)
Tape Drive and Robot
ROBOT Agent (Server)
NDMP Data Flow and Media Model
NDMP Sequence for Connect and Authentication
NDMP Sequence for Backups
NDMP Sequence for Recovery
Run time hierarchy
Job
Session
Agent
Activity
Implementation
Overview
NDMJOB Command
ndmjob.h - header for NDMJOB
command
ndmjob-args - example argument
macros
ndmjob_args.c - process
command arguments
ndmjob_job.c - construct
ndm_job_param from args
ndmjob_main.c - main()
routine
ndmjob_rules.c - apply
RULES to a job
Rules
ndmjr_none.h/.c - no additional
rules
Agents
Agents Implementation Method
ndmxx_initialize()
ndmxx_commission()
ndmxx_quantum()
ndmxx_decomission()
Agents General Files
ndmagents.h -
header file for Agents
ndma_dispatch.c
- dispatch NDMP requests
ndma_job.c - audit
and perform ndm_job_param
ndma_image_stream.c
- glue between DATA/TAPE
ndma_session.c
- session orchestration
ndma_subr.c -
Agents subroutines
Control Agent - General and
Support
ndma_control.c
- Dispatch job
ndma_ctrl_calls.c
- issue NDMP requests to appropriate agent
ndma_ctrl_conn.c
- XDR/TCP/IP setup and management
ndma_ctrl_media.c
- media positioning and results
ndma_ctrl_robot.c
- robotic support for session
Control Agent - Normal Operations
ndma_cops_backreco.c
- backup/recover
ndma_cops_labels.c
- initialize/query tape labels
ndma_cops_query.c
- query agents capabilities
ndma_cops_robot.c
- robot fixer and subroutines
Control Agent - Test/Diagnostic
Operations
ndma_ctst_mover.c
- test NDMP MOVER behaviour
ndma_ctst_subr.c
- test subroutines
ndma_ctst_tape.c
- test NDMP TAPE behaviour
Data Agent
ndma_data.c
ndma_data_fh.c
ndma_data_gtar.c
ndma_data_pfe.c
Tape Agent
ndma_tape.c
ndma_tape_simulator.c
Robot Agent
Utility Routines
ndmlib.h
ndml_agent.c - Agent spec:
host,logon,pw
ndml_chan.c - Channel, non-blocking
I/O
ndml_conn.c - XDR Connection
ndml_cstr.c - Canonical strings
ala HTTP
ndml_log.c - Logging helper
functions
ndml_media.c - Media spec
LABEL/SIZE+FM@ADDR
ndml_nmb.c - NDMP Message
Buf helpers
ndml_scsi.c
ndml_util.c
NDMP Protocol Support
ndmprotocol.h
ndmprotocol.c - Protocol
layer accessor routines
ndmp_ammend.h - Definitions
needed in spec
ndmp_translate.h/.c -
vX/v9 translators
ndmp[0239].x - Protocol
specification files
ndmp[0239].h - rpcgen(1)
generated header files
ndmp[0239]_xdr.c - rpcgen(1)
generated XDR routines
ndmp[023]_xmt.c - XDR
message dispatch tables
ndmp[0230]_enum_strs.h/.c
- enum/str for pp
ndmp[023]_pp.c - Pretty
Printers, used for snoop
SCSI Media Changer (smc)
scsiconst.h
smc.h
smc_api.c
smc_parse.c
smc_pp.c
smc_priv.h
smc_raw.h
TAR Format Support Routines
tarhdr_gnu.h
tarsnoop.c and tarsnoop.h
Operating System Specific
ndmos.h
ndmos.c
ndmos_freebsd.h and ndmos_freebsd.c
ndmos_solaris.h and ndmos_solaris.c
ndmos_xxx.h and ndmos_xxx.c
templates
About Recovery
Basic recovery steps
Acquisition
Disposition
Disposition
PASS
Disposition
DISCARD
Direct and sequential access
Environment variables controlling
access
DIRECT
RECOVER_DIRECT
NDMP ndmp_name structure
fh_info field
Known invalid
fh_info all 1s
RECOVER_FH_INFO_VALID
environment variable
RECOVER_PREFIX
environment variable
Direct/sequential vs. valid/invalid
fh_info
Recovery modes
Sequential mode
Semi-direct mode
Direct mode
Mixed mode
Porting
Tips
Introduction
What are NDMJOB and NDMJOBLIB
NDMJOB and NDMJOBLIB comprise
a software package which manages tape backup and recover operations using
NDMP (Network Data Management Protocol).
NDMJOB/NDMJOBLIB is more
library than command, and this library may serve as the foundation for
more elaborate CONTROL packages and/or better user interfaces. For the
purposes of this manual, NDMJOB precisely refers to the thin layer of
code -- about 5% -- which provides the command-line style user interface
to NDMJOBLIB. NDMJOBLIB refers to the rest of the software and does
all the real work.
NDMJOBLIB and NDMP in general
are frameworks for managing backups, not backup methods. There
is no NDMJOB tape format, for example. Rather, NDMJOBLIB is a "wrapper"
around programs which do the real work, and provides an interoperable
way to orchestrate backup/recover operations.
Adaptability to Applications
The NDMP behaviour and conventions
are isolated in operating system independent (O/S generic) modules. Operating
system specific portions, such as device access and manipulation, are
isolated in per-system modules. The O/S specific portions for any system
represent about 4% of the source code.
Application sensitive aspects
of NDMJOBLIB are isolated using C preprocessor (cpp(1)) #ifdef's. By
default, NDMJOB/NDMJOBLIB builds with all abilities and all NDMP versions
supported. Specific portions may be omitted by defining the right cpp(1)
symbol.
For example, server-only
applications may want to shed the NDMJOBLIB control capabilities, while
client-only applications may want to only keep the control capabilities.
NDMJOBLIB supports both
NDMPv2 and NDMPv3 simultaneously, and can coordinate the activities
of two agents with different versions. Again, the right cpp(1) symbol
can omit support for specific versions.
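As a rough illustration of the cpp(1) technique described above, the sketch below compiles a table of supported protocol versions and drops entries when an omission symbol is defined. NDMOS_OPTION_NO_NDMP2 and NDMOS_OPTION_NO_NDMP3 are the symbols named later in this manual; the table layout and the function name are hypothetical and are not taken from NDMJOBLIB.

```c
#include <stddef.h>

/*
 * Sketch only. The two NDMOS_OPTION_NO_NDMPx symbols are named in this
 * manual; everything else here is a hypothetical illustration.
 */
/* #define NDMOS_OPTION_NO_NDMP2 */    /* uncomment to omit NDMPv2 support */

struct version_entry {
    int         version;    /* NDMP protocol version number */
    const char *name;       /* printable name */
};

static const struct version_entry supported_versions[] = {
#ifndef NDMOS_OPTION_NO_NDMP2
    { 2, "NDMPv2" },
#endif
#ifndef NDMOS_OPTION_NO_NDMP3
    { 3, "NDMPv3" },
#endif
    { 0, NULL }             /* terminator; also keeps the array non-empty */
};

/* Return 1 if the given protocol version was compiled in, else 0. */
int version_is_supported(int version)
{
    const struct version_entry *e;

    for (e = supported_versions; e->name != NULL; e++) {
        if (e->version == version)
            return 1;
    }
    return 0;
}
```

Because the omission is decided at compile time, code for an unwanted version never reaches the binary, which is the point of using preprocessor symbols rather than run-time switches.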
Background, Purpose, and
Motivation
NDMJOB was originally developed
by Traakan, Inc. as a testing tool. It gradually became more sophisticated
in order to test tape robotics and directed access recovery. Further, it became
a vehicle for identifying and resolving ambiguities in the NDMP specification.
As NDMJOB evolved, it became a generally useful utility as well as a testing
tool. Traakan released NDMJOB into the public domain in the spirit of
fostering interoperability, and fostering the deployment of NDMP-based
backup systems.
Please e-mail comments and
contributions to ndmjob@traakan.com and/or ndmp-tech@ndmp.org.
Scope and Intended Audience
The audiences for this manual
include:
- developers interested
in this software or NDMP in general;
- developers of other NDMP
products using NDMJOB as a testing and diagnostic tool, and thus needing
technical insight into NDMJOB;
- developers using NDMJOBLIB
as a foundation for other applications;
- contributors to NDMJOB
who enhance and extend its capabilities; and
- retargetters who implement
NDMJOB on additional operating systems.
Terminology - "Agents" vs
"Client/Server"
The NDMJOB/NDMJOBLIB documentation
and code refer to "Agents", while the NDMP specification refers to "Clients"
and "Servers". The client/server model -- or duality -- is the most commonly
used for networking protocols. Peer-to-peer and agents are other terms
frequently used.
If you say "NDMP Client"
to a mixed audience of backup folks, half of them are thinking the same
thing you are and the other half aren't. The conversation invariably
comes to a halt while everybody syncs up on terminology. If you say
"NDMP Control agent" to an audience, everybody knows what you mean.
As NDMJOB and NDMP are introduced into more elaborate packages -- and
the client/server duality becomes difficult to apply -- the "agent"
terminology will continue to work.
The NDMP specification refers
to client/server. This is heavily influenced by the direction of connection.
The client side initiates activity, connects to one or more servers, and
then uses NDMP to coordinate activities. To people accustomed to
working with networking protocols, this terminology is crystal clear
and easily recognized: the elements which "connect" are clients, and
the elements which "listen" are the servers. Yet, ignore how connections
are established for a moment, and for NDMP the recognition gets more
difficult.
The promotional materials
and personnel of commercial backup packages also refer to client/server,
but the meaning is reversed from that of the NDMP specification. This
has an interesting history which we won't explore here. To these folks
the servers are the elements closest to the objective, and the clients are
elements which interact with and support the server. The objective, again
to these folks, is the management of backup schedules, file indexes,
recovery request queues, media aging and reassignment, and so on.
A tape drive, or anything that helps talk to it, plays a supportive role
to the server.
The client/server duality
breaks down for backup packages precisely because backup packages have
more than two elements. Although NDMJOB is a stand-alone CONTROL agent,
the CONTROL agent for commercial backup packages is encapsulated within
standing processes ("servers") which await connections from administrative
and user interfaces, and peer processes running the same package on
other machines. At appropriate times, these packages initiate and orchestrate
backup activity. It's easy to imagine backup packages with three, four,
or even five key elements. Architectures and systems with five elements
defy traditional client/server duality.
NDMJOB introduces its own
wrinkle with its "resident" agents. A resident agent does not involve
connecting and listening, nor any network connection at all. Resident
agents are accessed as a subroutine package.
The term "agent" is used
for SNMP (Simple Network Management Protocol) because large SNMP systems
also have more than two key elements. The issues surrounding NDMP and
backup packages are similar to those of SNMP, so we adopted the "agents"
terminology rather than inventing a new one. At times, client/server
is a handy shorthand for what's connecting and what's listening.
Other Documentation
The NDMPv2 and NDMPv3 specifications,
and the Workflow documents which describe NDMP sequences, may be found
on the NDMP web site, http://www.ndmp.org/.
The NDMJOB User Manual
explains the capabilities, command usage and options, and provides examples
and guidance.
NDMP Architectural Overview
NDMP Agents and Elements
CONTROL Agent (Client)
The CONTROL agent orchestrates
the activity of the DATA, TAPE, and ROBOT agents. Among other things,
the CONTROL agent:
- Implements all user
interfaces
- Maintains a file index
of all backups
- Initiates backup activity
- Initiates recovery (some
files) and restore (all files) activities
- Implements tape robot
control via SCSI pass-through
- Maintains a media (tape)
index
Disk Files
DATA Agent (Server)
The DATA agent constructs
and parses a backup image stream from or to a collection of disk files,
respectively. Think of the data agent as your favorite archive program,
like tar or zip, but rather than save the data to a file (or to tape),
it delivers the data to a data connection.
Image Stream
TAPE Agent (Server)
The TAPE agent stores or retrieves
a backup image stream to or from one or more tape files, respectively.
Tape Drive and Robot
ROBOT Agent (Server)
NDMP Data Flow and Media
Model
NDMP Sequence for Connect
and Authentication
| Item | Description |
| 1 | A standing NDMP daemon awaits connection requests on TCP port number 10000. |
| 2 | When a connection is received, the NDMP daemon launches an NDMP session. More than one NDMP session may be active at a time, though the NDMP sessions may not share tape devices. |
| 3 | An NDMP session control block is constructed by the NDMP session process. This contains all state information for the various components. |
| 4 | The NDMP session sends an NDMP_NOTIFY_CONNECTED message to the NDMP CONTROL agent (client), thus indicating that initialization is complete and the session may proceed. |
| 5 | The CONTROL agent negotiates a protocol version using NDMP_CONNECT_OPEN. |
| 6 | The CONTROL agent requests a list of acceptable authentication methods. |
| 7 | The CONTROL agent (client) presents the password to authenticate itself as having authority to perform the (privileged) backup/recover operations. |
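The ordering of the connect and authentication steps can be expressed as a small state machine seen from the CONTROL agent. This is only a sketch under stated assumptions: the phase and event names below are hypothetical, and the only facts taken from the sequence above are the order of the steps, the message names NDMP_NOTIFY_CONNECTED and NDMP_CONNECT_OPEN, and TCP port 10000.

```c
/* TCP port from step 1; the macro name itself is hypothetical. */
#define EXAMPLE_NDMP_PORT 10000

/* Phases of a connection as seen by the CONTROL agent (hypothetical names). */
enum conn_phase {
    PHASE_TCP_CONNECTED,       /* steps 1-3: daemon accepted, session created */
    PHASE_NOTIFIED,            /* step 4: NDMP_NOTIFY_CONNECTED received */
    PHASE_VERSION_SET,         /* step 5: NDMP_CONNECT_OPEN accepted */
    PHASE_AUTH_METHODS_KNOWN,  /* step 6: acceptable auth methods listed */
    PHASE_AUTHENTICATED,       /* step 7: password accepted */
    PHASE_FAILED               /* any out-of-order or rejected step */
};

enum conn_event {
    EV_NOTIFY_CONNECTED,
    EV_CONNECT_OPEN_OK,
    EV_AUTH_METHODS_KNOWN,
    EV_AUTH_OK
};

/* Advance one step; anything out of order lands in PHASE_FAILED. */
enum conn_phase conn_advance(enum conn_phase phase, enum conn_event ev)
{
    switch (phase) {
    case PHASE_TCP_CONNECTED:
        if (ev == EV_NOTIFY_CONNECTED) return PHASE_NOTIFIED;
        break;
    case PHASE_NOTIFIED:
        if (ev == EV_CONNECT_OPEN_OK) return PHASE_VERSION_SET;
        break;
    case PHASE_VERSION_SET:
        if (ev == EV_AUTH_METHODS_KNOWN) return PHASE_AUTH_METHODS_KNOWN;
        break;
    case PHASE_AUTH_METHODS_KNOWN:
        if (ev == EV_AUTH_OK) return PHASE_AUTHENTICATED;
        break;
    case PHASE_AUTHENTICATED:  /* no further connect-phase events expected */
    case PHASE_FAILED:
        break;
    }
    return PHASE_FAILED;
}
```

Feeding the four events in order takes a connection from PHASE_TCP_CONNECTED to PHASE_AUTHENTICATED; presenting a password before the version is negotiated, for example, yields PHASE_FAILED.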
NDMP Sequence for Backups
| Item | Description |
| 1 | The CONTROL agent, using the SCSI pass-through, initiates tape robot activity. If a tape robot is not being used, the CONTROL agent requests manual intervention. |
| 2 | The CONTROL agent requests that the TAPE agent open the appropriate tape drive. |
| 3 | Using the NDMP_TAPE interfaces, the CONTROL agent checks and positions the tape drive. Some CONTROL agents will write an identifying label, necessarily as a separate tape file, onto the tape. |
| 4 | The CONTROL agent requests that the TAPE agent start the NDMP_MOVER functions using NDMP_MOVER_LISTEN. The NDMP_MOVER is an interface between the image stream and the tape media. |
| 5 | The CONTROL agent requests that the DATA agent start using NDMP_DATA_START_BACKUP. The file set, NDMP_MOVER address (local or remote), and other parameters to govern the backup are included in the request. |
| 6 | The DATA agent connects to the TAPE agent, and commences delivery of the backup image stream. |
| 7 | The MOVER receives the connection, and commences blocking and writing the image stream to the tape. |
| 8 | As the DATA agent processes files, it delivers NDMP_FH (file history) information to the CONTROL agent. This is used by the CONTROL agent to construct a file index of the backup. |
| 9 | Periodically, the CONTROL agent will request status information for the NDMP_DATA, NDMP_TAPE, and NDMP_MOVER. |
| 10 | When the TAPE agent reaches the end of the tape, it notifies the CONTROL agent using an NDMP_NOTIFY_MOVER_PAUSED message. The NDMP_MOVER then suspends until directed to continue (step 14). |
| 11 | The CONTROL agent, using the NDMP_TAPE interfaces, directs the TAPE agent to write file marks, rewind the tape drive, and unload the tape. |
| 12 | The CONTROL agent moves tapes around using the SCSI pass-through to operate the tape robot. If there is no tape robot, the CONTROL agent requests manual intervention. |
| 13 | As with step 3, the CONTROL agent positions the tape and writes identifying data. |
| 14 | The CONTROL agent directs the NDMP_MOVER to resume using the NDMP_MOVER_CONTINUE request. |
| 15 | The TAPE agent continues to write the image stream to the tape drive. |
| 16 | When the backup is complete, the DATA agent enters the HALT state with the SUCCESSFUL reason, and notifies the CONTROL agent with an NDMP_NOTIFY_DATA_HALTED message. The image stream connection is closed. |
| 17 | The CONTROL agent issues an NDMP_DATA_STOP request. The DATA agent terminates. |
| 18 | The TAPE agent detects that the image stream is closed. |
| 19 | The TAPE agent flushes all pending writes to the tape drive. |
| 20 | The TAPE agent notifies the CONTROL agent that it is done with an NDMP_NOTIFY_MOVER_HALTED message. |
| 21 | The CONTROL agent issues an NDMP_MOVER_STOP request, which causes the NDMP_MOVER functions to terminate. |
| 22 | Using the NDMP_TAPE interfaces, the CONTROL agent writes file marks to the tape drive, rewinds, and unloads the tape. |
| 23 | The CONTROL agent, using the SCSI pass-through, operates the tape robot to store the tape. |
| 24 | The CONTROL agent issues NDMP_CONNECT_CLOSE request(s). Both the CONTROL agent and the session(s) close the TCP connection. |
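The MOVER behaviour that runs through the backup sequence (listen, active, paused at end of tape, continued, halted, stopped) can be summarized as a state machine. The sketch below is an illustration, not the NDMJOBLIB implementation: the enum and function names are hypothetical, and the return to an idle state after NDMP_MOVER_STOP is an assumption, since the sequence above only says that the MOVER functions terminate.

```c
/* Hypothetical names; the step numbers refer to the backup sequence above. */
enum mover_state {
    MOVER_IDLE, MOVER_LISTEN, MOVER_ACTIVE, MOVER_PAUSED, MOVER_HALTED
};

enum mover_event {
    MEV_LISTEN_REQ,      /* step 4:  NDMP_MOVER_LISTEN request */
    MEV_DATA_CONNECTED,  /* step 7:  image stream connection received */
    MEV_END_OF_TAPE,     /* step 10: leads to NDMP_NOTIFY_MOVER_PAUSED */
    MEV_CONTINUE_REQ,    /* step 14: NDMP_MOVER_CONTINUE request */
    MEV_STREAM_CLOSED,   /* steps 18-20: leads to NDMP_NOTIFY_MOVER_HALTED */
    MEV_STOP_REQ         /* step 21: NDMP_MOVER_STOP request */
};

/*
 * Apply one event. Returns 0 and updates *state on a legal transition,
 * or returns -1 and leaves *state untouched on an illegal one.
 */
int mover_step(enum mover_state *state, enum mover_event ev)
{
    switch (*state) {
    case MOVER_IDLE:
        if (ev == MEV_LISTEN_REQ)     { *state = MOVER_LISTEN; return 0; }
        break;
    case MOVER_LISTEN:
        if (ev == MEV_DATA_CONNECTED) { *state = MOVER_ACTIVE; return 0; }
        break;
    case MOVER_ACTIVE:
        if (ev == MEV_END_OF_TAPE)    { *state = MOVER_PAUSED; return 0; }
        if (ev == MEV_STREAM_CLOSED)  { *state = MOVER_HALTED; return 0; }
        break;
    case MOVER_PAUSED:
        if (ev == MEV_CONTINUE_REQ)   { *state = MOVER_ACTIVE; return 0; }
        break;
    case MOVER_HALTED:
        /* Assumption: a stopped MOVER becomes idle and reusable. */
        if (ev == MEV_STOP_REQ)       { *state = MOVER_IDLE; return 0; }
        break;
    }
    return -1;
}
```

A multi-tape backup simply loops through the ACTIVE/PAUSED pair once per tape change before the stream closes.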
NDMP Sequence for Recovery
| Item | Description |
| 1 | The CLIENT, using the SCSI pass-through, initiates tape robot activity. If a tape robot is not being used, the CLIENT requests manual intervention. |
| 2 | The CLIENT requests opening of the appropriate tape drive via NDMP_TAPE_OPEN. |
| 3 | Using the NDMP_TAPE interfaces, the CLIENT checks and positions the tape drive. |
| 4 | The CLIENT requests the MOVER thread start using NDMP_MOVER_LISTEN. |
| 5 | The CLIENT requests the DATA thread start using NDMP_DATA_START_RESTORE. |
| 6 | The DATA thread analyzes the parameters, and issues a tape seek and read using NDMP_NOTIFY_DATA_READ. The CLIENT may unload and reload the tape drive. |
| 7 | The CLIENT sends an NDMP_MOVER_SET_WINDOW request, which establishes the relative position of the tape file to the original data stream. Then, the CLIENT sends an NDMP_MOVER_READ request with the same parameters as the NDMP_NOTIFY_DATA_READ. The MOVER thread is responsible for resolving the window with the data-stream relative offsets. |
| 8 | The MOVER thread begins reading the tape drive. |
| 9 | The MOVER writes the data stream, and the DATA thread reads it. |
| 10 | As files are restored, the DATA thread sends NDMP_LOG_FILE messages identifying what was recovered. Only names that match the original request (step 5) are reported. Files and subdirectories are not individually reported. |
| 11 | When the current tape file is exhausted, the MOVER enters the paused state and informs the CLIENT of such with NDMP_NOTIFY_MOVER_PAUSED. |
| 12 | The CLIENT, using the NDMP_TAPE interfaces, rewinds the tape drive (or possibly positions to another tape file). |
| 13 | The CLIENT requests the tape drive be closed using NDMP_TAPE_CLOSE. |
| 14 | The CLIENT initiates tape robot activity to change tapes. |
| 15 | The CLIENT re-opens the tape drive. |
| 16 | The CLIENT repositions the tape. |
| 17 | The CLIENT sends an NDMP_MOVER_CONTINUE request, and the MOVER reenters the active state. It continues to read data from the tape and write it to the data connection. |
| 18 | When the DATA thread has recovered all files (or exhausted the backup), it enters the halted state and informs the CLIENT of such using NDMP_NOTIFY_DATA_HALTED. |
| 19 | The CLIENT issues an NDMP_DATA_STOP, which causes the DATA thread to terminate. The data connection is closed. |
| 20 | The MOVER notices the closed connection, enters the halt state, and informs the CLIENT of such with NDMP_NOTIFY_MOVER_HALTED. |
| 21 | The CLIENT issues an NDMP_MOVER_STOP request, and the MOVER thread terminates. |
| 22 | Using the NDMP_TAPE interfaces, the CLIENT rewinds the tape drive. |
| 23 | The CLIENT uses NDMP_TAPE_CLOSE to close the tape drive. |
| 24 | The CLIENT uses the SCSI pass-through to store the tape. |
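Step 7 leaves the MOVER to reconcile the window with data-stream relative offsets. One way to picture that arithmetic is sketched below. It assumes, for illustration only, that the window begins at the start of the current tape file and that offsets and lengths are byte counts; the structure and function names are hypothetical and are not taken from the NDMP specification or NDMJOBLIB.

```c
#include <stdint.h>

/* Window as set by an NDMP_MOVER_SET_WINDOW request (hypothetical layout). */
struct mover_window {
    uint64_t offset;   /* data-stream offset where this tape file begins */
    uint64_t length;   /* bytes of the data stream held by this tape file */
};

/*
 * Resolve a read request, given in data-stream terms, against the window.
 * On success returns 0 and sets:
 *   *tape_skip  bytes to skip from the start of the tape file
 *   *available  bytes that can be served before the window is exhausted
 *               (after which the MOVER would pause, as in step 11)
 * Returns -1 if the requested offset lies outside the window.
 */
int window_resolve(const struct mover_window *w,
                   uint64_t read_offset, uint64_t read_length,
                   uint64_t *tape_skip, uint64_t *available)
{
    uint64_t window_left;

    if (read_offset < w->offset || read_offset - w->offset >= w->length)
        return -1;

    *tape_skip = read_offset - w->offset;
    window_left = w->length - *tape_skip;
    *available = (read_length < window_left) ? read_length : window_left;
    return 0;
}
```

For a window covering stream bytes 1000 through 1499, a read of 300 bytes at offset 1400 resolves to a skip of 400 bytes with only 100 bytes available from this tape file; the remainder would come from the next window.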
Run time hierarchy
|
(from ndmagents.h)
"job" -> ndma_client_session() ndma_server_session()
| |
/-------------/ Q
| v
| +------------------------------------+ +-----------+
| /->| SESSION QUANTUM |----->| disp conn |
| | +------------------------------------+ \ +-----------+
| Q Q Q Q Q Q | v |
| | | | | | | | +----------+ |
| /-----|----+--|----+----|----+--|----+---|----| dispatch | |
| | | | | | | | | | | | | | request | |
| v | v v v v v v v v v | +----------+ |
| +-------+ +----+ +------+ +----+ +-----+ | ^ |
+>|CONTROL| |DATA| |IMAGE | |TAPE| |ROBOT| | | |
| | | |->|STREAM|<-| | | | | | |
| | | *====* *====* | | | | | ndmconn_recv()
| ndmca | ndmda| |ndmis | ndmta| |ndmra| | | |
+-------+ +----+ +------+ +----+ +-----+ | |resi |
| | | | | | | +------+ |
\-----|-+----|----+-------+-------------->| call | |
| | | +------+ |
formatter| |image_stream | |remo |
v v | v v
+---------+<---ndmchan_poll()----/ +---------+
| ndmchan |<---------------------------| ndmconn |
+---------+ +---------+
non-blocking I/O XDR wrapper
-----> caller/callee
--Q--> quantum (CPU scheduling)
====== image stream shared data structures
|
Job
Session
Agent
Activity
Implementation Overview
|
(from Makefile)
This illustrates the strata (layers) of the
NDMJOB/NDMJOBLIB software, the scope of key
header (.h) files, and the source files
constituting each layer.
- - - - - +---------------------------------------+
^ ^ ^ ^ ndmjob.h | NDMJOB Command ndmjob_*.c |
| | | | - +---------------------------------------+
| | | | NDMJOBLIB API "job"
| | | | +---------------------------------------+ \
| | | | | Rules (NDMJLR) ndmjr_*.[ch] | \
| | | ndmagents.h | Agents (NDMJLA) ndma_*.c | |
| | | - +---------------------------------------+ |
| | ndmlib.h | Library (NDMJLL) ndml_*.c | |
| | - +---------------------------------------+ |
| ndmprotocol.h | Protocol (NDMJLP) ndmp*.[chx] | NDMJOBLIB
| - +---------------------------------------+ |
| | SMC (NDMJLS) smc*.[ch] | |
| | Formats (NDMJLF) tar*.[ch] | |
| +---------------------------------------+ |
ndmos.h | OS intf (NDMJLO) ndmos*.[ch] | /
- +---------------------------------------+ /
|
NDMJOB Command
ndmjob.h - header for
NDMJOB command
The primary header file for
the NDMJOB command line interface. All global variables for the command
are defined here. Most of the global variables directly correspond to
command line options.
ndmjob-args - example
argument macros
Sample arg macros file for
NDMJOB.
ndmjob_args.c - process
command arguments
NDMJOB command line argument
processing
ndmjob_job.c - construct
ndm_job_param from args
NDMJOB synthesis of structure
used to enter the NDMAGENTS library.
ndmjob_main.c - main()
routine
NDMJOB main() routines. Handles
log files, debug levels, etc.
ndmjob_rules.c - apply
RULES to a job
Rules
ndmjr_none.h/.c - no additional
rules
These contain the rule set
for -o rules=none. They are here primarily as templates.
Agents
Agents Implementation
Method
ndmxx_initialize()
ndmxx_commission()
ndmxx_quantum()
ndmxx_decomission()
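The four entry points listed above suggest a common life cycle for every agent, with the session giving each agent a quantum in turn (the Q arrows in the run time hierarchy diagram). The following is a minimal sketch of that pattern, not the NDMJOBLIB code: the structure, the agent_* and session_* names, and the work counter standing in for pending I/O are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical agent control block; a counter stands in for real work. */
struct agent {
    const char *name;
    int         work_left;      /* units of pending work */
    int         commissioned;   /* nonzero while the agent is in service */
};

/* Put the control block into a known, idle state. */
void agent_initialize(struct agent *a, const char *name)
{
    a->name = name;
    a->work_left = 0;
    a->commissioned = 0;
}

/* Bring the agent into service with some work to do. */
void agent_commission(struct agent *a, int work)
{
    a->work_left = work;
    a->commissioned = 1;
}

/* Do one bounded slice of work. Returns nonzero if progress was made. */
int agent_quantum(struct agent *a)
{
    if (!a->commissioned || a->work_left == 0)
        return 0;
    a->work_left--;
    return 1;
}

/* Take the agent out of service and drop any remaining work. */
void agent_decommission(struct agent *a)
{
    a->work_left = 0;
    a->commissioned = 0;
}

/* One session quantum: every agent gets one turn. Returns agents that progressed. */
int session_quantum(struct agent *agents, size_t n)
{
    size_t i;
    int progressed = 0;

    for (i = 0; i < n; i++)
        progressed += agent_quantum(&agents[i]) ? 1 : 0;
    return progressed;
}

/* Run session quanta until no agent makes progress. Returns the rounds used. */
int session_run(struct agent *agents, size_t n)
{
    int rounds = 0;

    while (session_quantum(agents, n) > 0)
        rounds++;
    return rounds;
}
```

The design point is that no agent blocks: each quantum does a little work and returns, so several agents can share one process without threads.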
Agents General Files
ndmagents.h - header
file for Agents
ALL AGENTS header file for
the NDMAGENTS library.
ndma_dispatch.c - dispatch
NDMP requests
ALL AGENTS dispatch routines
for NDMP messages. Most audits (error/status checking) are done here.
ndma_job.c - audit and
perform ndm_job_param
CONTROL AGENT API into NDMAGENT
library
ndma_image_stream.c
- glue between DATA/TAPE
DATA/TAPE image stream subroutines.
The image stream is the data stream between the DATA and TAPE Agents.
ndma_session.c - session
orchestration
ALL AGENTS session management.
Primarily processes connections and dispatches quantums.
ndma_subr.c - Agents
subroutines
ALL AGENTS various subroutines.
Control Agent - General
and Support
ndma_control.c - Dispatch
job
ndma_ctrl_calls.c -
issue NDMP requests to appropriate agent
CONTROL AGENT call interfaces
ndma_ctrl_conn.c - XDR/TCP/IP
setup and management
CONTROL AGENT connection
management for connections to the other (DATA/TAPE/ROBOT) Agents.
ndma_ctrl_media.c -
media positioning and results
CONTROL AGENT media management.
Tape labels, robotics, tape positioning, window sizes and window capture.
ndma_ctrl_robot.c -
robotic support for session
CONTROL AGENT tape robotics
subroutines
Control Agent - Normal
Operations
ndma_cops_backreco.c
- backup/recover
CONTROL AGENT OPERATIONS
for backup/recover. This is complete.
ndma_cops_labels.c -
initialize/query tape labels
CONTROL AGENT OPERATIONS
for tape labeling and label queries
ndma_cops_query.c -
query agents capabilities
CONTROL AGENT OPERATIONS
for NDMP server queries
ndma_cops_robot.c -
robot fixer and subroutines
CONTROL AGENT OPERATIONS
for tape robotics operations, primarily subroutines for the other
ndma_cops_xxx.c files.
Control Agent - Test/Diagnostic
Operations
|
(from ndma_ctst_mover.c)
NDMP Elements of a test-mover session
+-----+ ###########
| Job |----># CONTROL #
+-----+ # Agent #
# #
###########
# | |
#=============# | +---------------------+
# | |
CONTROL # control | connections |
impersonates # V V
DATA side of # ############ +-------+ #########
image stream # # TAPE # | | # ROBOT #
# # Agent # | ROBOT |<-># Agent #
# image # +------+ # |+-----+| # #
#==============|mover |=====|DRIVE|| # #
stream # +------+ # |+-----+| # #
############ +-------+ #########
|
ndma_ctst_mover.c -
test NDMP MOVER behaviour
CONTROL AGENT TEST OPERATIONS
for testing the NDMP MOVER component of a TAPE Agent.
ndma_ctst_subr.c - test
subroutines
CONTROL AGENT TEST OPERATIONS
subroutines for the other ndma_ctst_xxx.c files.
ndma_ctst_tape.c - test
NDMP TAPE behaviour
CONTROL AGENT TEST OPERATIONS
for testing the NDMP TAPE component of a TAPE Agent. This does a lot
of tape positioning and verification of status/error codes for certain
conditions.
Data Agent
ndma_data.c
DATA AGENT primary implementation.
Constructors, destructors, semantic actions, and quantum processing.
ndma_data_fh.c
DATA AGENT file history
(FH) support routines.
ndma_data_gtar.c
DATA AGENT GNU tar(1) interface.
ndma_data_pfe.c
DATA AGENT pipe/fork/exec
subroutines.
Tape Agent
ndma_tape.c
TAPE AGENT primary implementation.
Constructors, destructors, semantic actions, and quantum processing.
ndma_tape_simulator.c
TAPE AGENT simulator of
a properly functioning tape drive and driver. This uses a disk file.
It can be used as a reference for implementing other tape drive/driver
interfaces, which are necessarily OS DEPENDENT.
Robot Agent
Utility Routines
ndmlib.h
NDMLIB header file
ndml_agent.c - Agent spec:
host,logon,pw
NDMLIB library routine for
processing Agent identification and authentication (host name, password,
etc). This is used by NDMJOB to assist command line processing.
ndml_chan.c - Channel,
non-blocking I/O
NDMLIB channel functions.
A channel is simply a recurring I/O operation.
ndml_conn.c - XDR Connection
NDMLIB connection functions.
Connections are channels (network connections) between NDMP Agents.
ndml_cstr.c - Canonical
strings ala HTTP
NDMLIB canonical string functions.
These convert strings to and from an HTTP-like canonical form.
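The heading describes canonical strings "ala HTTP". As a hedged illustration of what such a form can look like, the sketch below escapes every byte that is not a visible ASCII character, plus the percent sign itself, as %XX. Both the escaping rule and the function name are assumptions made for this example; the actual rules in ndml_cstr.c may differ.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Encode src into dst using HTTP-style %XX escapes (hypothetical rule:
 * escape anything that is not a visible character, and '%' itself).
 * Returns the encoded length, or -1 if dst is too small.
 */
int example_cstr_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p;
    size_t o = 0;

    for (p = (const unsigned char *)src; *p != '\0'; p++) {
        if (isgraph(*p) && *p != '%') {
            if (o + 1 >= dst_size)      /* need room for char + NUL */
                return -1;
            dst[o++] = (char)*p;
        } else {
            if (o + 3 >= dst_size)      /* need room for %XX + NUL */
                return -1;
            dst[o++] = '%';
            dst[o++] = hex[*p >> 4];
            dst[o++] = hex[*p & 0x0F];
        }
    }
    if (o >= dst_size)                  /* no room for the terminator */
        return -1;
    dst[o] = '\0';
    return (int)o;
}
```

With this rule, a name containing a space such as "a b" becomes "a%20b", so the result can be carried safely in whitespace-delimited logs or command arguments.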
ndml_log.c - Logging helper
functions
NDMLIB log helper functions.
ndml_media.c - Media spec
LABEL/SIZE+FM@ADDR
NDMLIB media helper functions.
Primarily command line argument processing helpers.
ndml_nmb.c - NDMP Message
Buf helpers
NDMLIB NDMP Message Buffer
(NMB) support routines.
ndml_scsi.c
NDMLIB SCSI interfaces. This
is complete, but there is some cleanup work to do. That's why it hasn't
been renamed to ndml_scsi.c
ndml_util.c
NDMLIB utility functions
NDMP Protocol Support
There are multiple versions of
NDMP. This layer gathers them together. Under control of #ifdef NDMOS_OPTION_NO_NDMPx
specific versions may be omitted. At this time, NDMPv2 and NDMPv3 are
deployed. NDMPv1 was defined but not widely deployed, and is deemed irrelevant.
NDMPv4 is under consideration.
NDMP is defined using RPC
protocol specification files (.x files). NDMP does not really use the
RPC layer, but it does use the RPC XDR (External Data Representation)
layer.
The original NDMP .x files
are cosmetically transformed for NDMJOBLIB. The original NDMPv2 and
NDMPv3 .x files use names like ndmp_name and ndmp_config_get_host_info_reply.
These changed between versions even though they have the same name.
Data structures which didn't change, like ndmp_pval, caused compile-time
agony. For example, xdr_ndmp_pval() would be multiply defined at ld(1)-time.
The first approach considered and rejected to resolve this was to make
a unified, all versions .x file. It was rejected because it becomes
difficult, even impractical, to integrate new versions and to omit old
ones. The approach taken was to transform the names to reflect protocol
version. This same approach was adopted by NFS for NFSv3 and NFSv4.
Now there is an ndmp2_pval and an ndmp3_pval, and the compiler is happy.
When it's defined, there will be an ndmp4_pval.
There are two pseudo-versions
of the protocol here: NDMPv0 and NDMPv9. These are used for internal
convenience. These are also defined using .x files because it's easy
to cut-n-paste from the official .x files. Neither NDMPv0 nor NDMPv9
may be omitted.
NDMPv0 is the NDMP protocol
subset used before the protocol version negotiation is complete. This
subset of the protocol must necessarily remain immutable and constant
for all time. NDMPv0 is the over-the-wire protocol until the version
is negotiated.
NDMPv9 is an internal representation
of the protocol and isolates higher layers of NDMJOB from most variations
between protocol versions. NDMPv9 makes it a little easier to add new
versions and omit older ones. NDMPv9 is never used over-the-wire, and
therefore there are no XDR routines.
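The version-stamped names and the NDMPv9 internal form can be pictured with the name/value pair structure mentioned above. This is a simplified sketch: ndmp2_pval and ndmp3_pval are the names this manual gives, ndmp9_pval follows the same pattern, and the two-field layout and the example_* function names are hypothetical stand-ins for the rpcgen(1)-generated types and the real translators.

```c
#include <stddef.h>
#include <string.h>

/* Per-version wire types: identical here, but distinct names keep ld(1) happy. */
struct ndmp2_pval { char *name; char *value; };
struct ndmp3_pval { char *name; char *value; };

/* Internal (never on the wire) representation used by higher layers. */
struct ndmp9_pval { char *name; char *value; };

/* Translate NDMPv2 to the internal form (shallow copy of the pointers). */
void example_2to9_pval(const struct ndmp2_pval *p2, struct ndmp9_pval *p9)
{
    p9->name = p2->name;
    p9->value = p2->value;
}

/* Translate NDMPv3 to the internal form (shallow copy of the pointers). */
void example_3to9_pval(const struct ndmp3_pval *p3, struct ndmp9_pval *p9)
{
    p9->name = p3->name;
    p9->value = p3->value;
}

/*
 * A higher-layer routine sees only the internal form, so it does not
 * care which protocol version the value arrived in.
 */
const char *example_pval_lookup(const struct ndmp9_pval *tab, size_t n,
                                const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].value;
    }
    return NULL;
}
```

Adding a new protocol version under this scheme means adding one more pair of translators; routines such as the lookup above stay unchanged.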
There are three primary
elements of this layer:
- Header files which define
each version of the protocol. These are generated from files (.x files)
by rpcgen(1).
- XDR routines which convert
to/from the over-the-wire protocol and internal data structures. These
are also generated by rpcgen(1). There are also tables of XDR routines.
- Support for pretty-printing
the protocol data structures. Maybe someday rpcgen(1) will generate
these, too.
ndmprotocol.h
This is the key #include file
for the NDMP protocol layer of NDMJOBLIB. It #include's the other header
files for this layer based on protocol version configuration preprocessor
symbols (NDMOS_OPTION_NO_NDMP2 and NDMOS_OPTION_NO_NDMP3).
ndmprotocol.c - Protocol
layer accessor routines
ndmp_ammend.h - Definitions
needed in spec
NDMP PROTOCOL amendments.
Makes certain names follow logical rules. These rules are used in a
great many macros.
ndmp_translate.h/.c -
vX/v9 translators
ndmp[0239].x - Protocol
specification files
ndmp[0239].h - rpcgen(1)
generated header files
ndmp[0239]_xdr.c - rpcgen(1)
generated XDR routines
ndmp[023]_xmt.c - XDR
message dispatch tables
ndmp[0230]_enum_strs.h/.c
- enum/str for pp
ndmp[023]_pp.c - Pretty
Printers, used for snoop
SCSI Media Changer (smc)
scsiconst.h
SCSI constants used for the
tape robotics
smc.h
SCSI MEDIA CHANGER header
file
smc_api.c
SCSI MEDIA CHANGER library
API
smc_parse.c
SCSI MEDIA CHANGER parser
for the data returned by certain queries.
smc_pp.c
SCSI MEDIA CHANGER pretty-printer
(pp)
smc_priv.h
SCSI MEDIA CHANGER private
header file.
smc_raw.h
SCSI MEDIA CHANGER raw format
of query result data
TAR Format Support Routines
tarhdr_gnu.h
GNU Tar record structure header
file
tarsnoop.c and tarsnoop.h
General tar(1) data stream
snooper.
Operating System Specific
By Operating System (O/S) specific
we mean the programming environment including compilers, header files,
as well as the host O/S APIs. O/S specific is clear and concise, so that's
how we refer to the hosting environment.
The ndmos.h file essentially
#include's the right ndmos_xxx.h for the hosting environment. The companion
source C files, ndmos_xxx*.c, are similarly selected by ndmos.c.
The strategy for separating
the O/S specific and O/S generic portions of NDMJOBLIB has four key
points:
- Isolate O/S specific
portions in separate files which can be developed, contributed, and
maintained independently of the overall source base.
- NEVER NEVER #ifdef based
on O/S or programming environment in the O/S generic portions. These
make collective maintenance and integration too difficult.
- Use O/S specific #define
macros (NDMOS_...) and C functions (ndmos_...) as wrappers around
the portions that vary between environments and applications.
- Use generic, objective-oriented
#ifdef's to isolate and omit functionality which may not be wanted
in all applications.
There are templates in ndmos_xxx.h
and ndmos_xxx.c to get started on a new O/S specific portion. Send contributions
to the current keeper of NDMJOB. Contact ndmp-tech@ndmp.org for details.
DO NOT MODIFY
ANY GENERIC PORTION OF NDMJOBLIB FOR THE SAKE OF A HOSTING ENVIRONMENT
OR APPLICATION.
If you discover additional isolation
requirements, raise the issue on ndmp-tech@ndmp.org. Propose new #define
NDMOS_ macros to address them. Then, submit the proposal with required
changes to the current keeper of NDMJOB. Changes to the generic portion
which use #ifdef's based on anything other than NDMOS_ macros will be
summarily rejected.
ndmos.h
There are four sections of
this file:
- Establish identities
for various O/S platforms
- Try to auto-recognize
the environment
- #include the right O/S
specific ndmos_xxx.h
- Establish default #define-itions
for macros left undefined
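The four sections might be laid out roughly as follows. This is a condensed sketch, not the actual ndmos.h: the NDMOS_IDENT encoding, the NDMOS_ID_UNKNOWN fallback, the NDMOS_API_BZERO default, and generic_clear_buffer() are illustrative names assumed here, and the #include of the O/S specific header is shown only as a comment so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Section 1: establish identities for the known platforms.
 * The four-character encoding is an assumption of this sketch. */
#define NDMOS_IDENT(a, b, c, d) (((a) << 24) | ((b) << 16) | ((c) << 8) | (d))
#define NDMOS_ID_FREEBSD NDMOS_IDENT('F', 'B', 's', 'd')
#define NDMOS_ID_SOLARIS NDMOS_IDENT('S', 'o', 'l', 'a')
#define NDMOS_ID_UNKNOWN NDMOS_IDENT('U', 'n', 'k', 'n') /* hypothetical */

/* Section 2: try to auto-recognize the environment, unless the
 * build already chose one (for example with -DNDMOS_ID=...). */
#ifndef NDMOS_ID
#if defined(__FreeBSD__)
#define NDMOS_ID NDMOS_ID_FREEBSD
#elif defined(__sun) && defined(__SVR4)
#define NDMOS_ID NDMOS_ID_SOLARIS
#else
#define NDMOS_ID NDMOS_ID_UNKNOWN
#endif
#endif

/* Section 3: #include the right O/S specific header.
 * Shown as comments because those files are not part of this sketch. */
#if NDMOS_ID == NDMOS_ID_FREEBSD
/* #include "ndmos_freebsd.h" */
#elif NDMOS_ID == NDMOS_ID_SOLARIS
/* #include "ndmos_solaris.h" */
#endif

/* Section 4: defaults for macros the O/S specific header left undefined. */
#ifndef NDMOS_API_BZERO
#define NDMOS_API_BZERO(p, n) memset((p), 0, (n))
#endif

/* Generic code calls only the NDMOS_ wrappers, never an O/S #ifdef. */
void generic_clear_buffer(void *buf, size_t len)
{
    assert(buf != NULL);
    NDMOS_API_BZERO(buf, len);
}
```

The point of the layout is that generic_clear_buffer() stays identical on every platform; only the macro definitions behind it change.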
ndmos.c
This merely #include's the
right O/S specific C file based on the NDMOS_ID preprocessor symbol.
The O/S specific source file is contained in a file ndmos_xxx.c, where
xxx is the name for the programming environment.
ndmos_freebsd.h and ndmos_freebsd.c
The O/S specific files for
FreeBSD. As of this writing (NDMJOBLIB 1.1), it uses the tape simulator
(ndma_tape_simulator.c) and does not implement the SCSI pass-thru.
ndmos_solaris.h and ndmos_solaris.c
The O/S specific files for
Solaris. As of this writing (NDMJOBLIB 1.1), it uses the tape simulator
(ndma_tape_simulator.c) and does not implement the SCSI pass-thru.
ndmos_xxx.h and ndmos_xxx.c
templates
These are templates for creating
additional O/S specific portions. Copy to new files with xxx replaced
with a short name for the hosting programming environment, then follow
the directions in the comments. Please contribute your working module
to ndmjob@traakan.com.
About Recovery
A great deal of NDMP's merit,
of its implementation difficulties, and of the disparity among NDMP implementations
centers on the recovery features. This section discusses NDMJOB recovery
operations.
The NDMP recovery process
is about selecting objects from the image stream in as efficient a manner
as possible. Selected objects are passed to the formatter program for
processing (which usually means storing).
Basic recovery steps
Recovery can be viewed as two
steps for each object:
Acquisition
Pre-read enough of the object
in order to fully identify it and subsequently determine its disposition.
During the pre-read, the image stream is not passed to the formatter.
Pre-read data is held in the plumb.image channel buffer.
Disposition
Once enough of the object
has been pre-read, its disposition is decided. An object is either PASSED
or DISCARDED.
Disposition PASS
The pre-read portion plus
the rest of the object are passed to the formatter program via the
formatter_image channel. This simply requires copying the required
amount of data from the image channel buffer to the formatter_image
buffer.
Disposition DISCARD
The pre-read portion plus the
rest of the object are simply consumed out of the image channel buffer.
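The two dispositions can be sketched as one routine operating on a pair of channel buffers. The struct chan_buf layout, the 256-byte capacity, and the dispose() function are hypothetical stand-ins for illustration; they are not the NDMJOBLIB channel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical channel buffer: the valid bytes are data[beg..end). */
struct chan_buf {
    char   data[256];
    size_t beg;
    size_t end;
};

enum disposition { DISP_PASS, DISP_DISCARD };

/* Dispose of up to n bytes held in the image buffer.
 * PASS copies them to the formatter buffer; DISCARD just consumes them.
 * Returns the number of bytes actually consumed from the image buffer. */
size_t dispose(struct chan_buf *image, struct chan_buf *formatter,
               enum disposition disp, size_t n)
{
    size_t ready;

    assert(image != NULL && formatter != NULL);

    ready = image->end - image->beg;
    if (n > ready)
        n = ready;

    if (disp == DISP_PASS) {
        size_t room = sizeof formatter->data - formatter->end;
        if (n > room)
            n = room;               /* pass only what currently fits */
        memcpy(formatter->data + formatter->end,
               image->data + image->beg, n);
        formatter->end += n;
    }

    image->beg += n;                /* consumed in both cases */
    return n;
}
```

A caller would invoke dispose() repeatedly as more of the object arrives, since a single call may handle only part of the requested amount.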
The backup image is a sequence
of objects. Some objects are selected (PASSED), some are not (DISCARDED).
We expect objects to appear in consecutive groups of either selected or
not selected, thus creating a detectable "edge". This edge can trigger
certain optimizations. Detecting the edge is as simple as recognizing
when the current disposition is not the same as the previous disposition.
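Edge detection as described above amounts to remembering the previous disposition. A minimal sketch follows; the enum values and the disposition_edge() name are assumptions made for this example, with DISP_NONE standing for "no object seen yet".

```c
#include <assert.h>
#include <stddef.h>

enum disposition { DISP_NONE, DISP_PASS, DISP_DISCARD };

/* Record cur as the latest disposition and report whether it differs
 * from the previous one. The very first object never counts as an edge. */
int disposition_edge(enum disposition *prev, enum disposition cur)
{
    int edge;

    assert(prev != NULL && cur != DISP_NONE);

    edge = (*prev != DISP_NONE && *prev != cur);
    *prev = cur;
    return edge;
}
```

For the sequence PASS, PASS, DISCARD, DISCARD, PASS the function reports an edge on the third and fifth objects only.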
Direct and sequential access
The NDMP architecture allows
for two methods of access to the backup image during recovery: direct
and sequential.
Direct access allows for
the DATA agent to cue the TAPE agent for portions of the backup image.
This is done with the NDMP_NOTIFY_DATA_READ and NDMP_MOVER_READ interfaces.
The TAPE agent uses these cues to rapidly position the tape to the required
portion. CONTROL agent intervention is required for tape changes and
such.
There are times when direct
access is impossible. Two examples spring to mind. First, the NDMPCOPY
scenario, where one DATA agent is constructing a backup image, the image
is delivered to a second DATA agent, and it recovers the image to disk.
Second, when the CONTROL or TAPE agent does not support (implement)
the direct access features of NDMP.
Hence, there are times when
the entire backup image must be conveyed over the image stream and processed.
This is the sequential access method.
Environment variables
controlling access
DIRECT
The NDMPv3 spec mentions
an environment variable "DIRECT", which is either "yes" or "no". The
spec does not clearly state the semantics of this variable. Here,
it is defined. If "no", discrete NDMP_NOTIFY_DATA_READ requests may
not be issued. The DATA agent is expected to issue a single NDMP_NOTIFY_DATA_READ
with an offset of 0 and a length of infinity (all 1s). Such a request
is the strict definition of SEQUENTIAL access. If "DIRECT" is "yes",
the DATA agent MAY, though is not required to, use discrete NDMP_NOTIFY_DATA_READ
requests. The DIRECT variable has no bearing on the fh_info fields
(see below).
RECOVER_DIRECT
Per postings to the ndmp-tech
e-mail list, the environment variable favored is "RECOVER_DIRECT",
rather than just "DIRECT". If RECOVER_DIRECT is not given, "DIRECT"
is checked. If neither is given, SEQUENTIAL access is used, as per
the NDMPv3 spec which says the default value of DIRECT is "n".
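The lookup order for the two variables can be sketched as below. The struct env_pair type and both function names are hypothetical, and treating any value that begins with 'y' or 'Y' as "yes" is an assumption of this sketch (it accepts both "yes" and the short "y" form).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one NDMP environment name/value pair. */
struct env_pair {
    const char *name;
    const char *value;
};

/* Return the value of the named variable, or NULL if it was not given. */
const char *env_lookup(const struct env_pair *env, size_t n_env,
                       const char *name)
{
    size_t i;

    for (i = 0; i < n_env; i++)
        if (strcmp(env[i].name, name) == 0)
            return env[i].value;
    return NULL;
}

/* Return 1 if direct access may be used, 0 for SEQUENTIAL access. */
int recover_direct_wanted(const struct env_pair *env, size_t n_env)
{
    const char *v;

    assert(env != NULL || n_env == 0);

    v = env_lookup(env, n_env, "RECOVER_DIRECT");   /* favored name */
    if (v == NULL)
        v = env_lookup(env, n_env, "DIRECT");       /* fallback name */
    if (v == NULL)
        return 0;                                   /* default: sequential */
    return v[0] == 'y' || v[0] == 'Y';
}
```

Note that RECOVER_DIRECT wins whenever it is present, even if DIRECT is also given with a different value.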
NDMP ndmp_name structure
The ndmp_name structure has
three important fields. The "name" field is the name of the file/object
as it occurs in the backup image. The "dest" field is the name under which
the object should be stored. The "fh_info" field is a 64-bit cookie generated
at the time the backup_image was constructed which is used to identify
the position of the object in the backup_image.
The NDMPv2 specification says
that the name field should be the path name of the file/object, relative
to the backup root, as it occurs in the backup. The name field participates
in the DISPOSITION phase of object processing (see Basic recovery steps above).
The NDMP specifications (v2 and v3) give neither a requirement nor guidance
on whether a name implies selection of a single object or perhaps
a collection of objects. For example, if the named object is a directory,
should just the directory be recovered, or should all objects at or
below the directory be recovered? Conventional practice is that the
recovering DATA agent will answer the question in a manner it deems
natural. For "tar" format backups, as implemented here, a named directory
implies the directory and its contents. The name field is used as a
prefix match for objects in the stream. Objects which match are deemed
selected, and their DISPOSITION is PASS.
The dest field specifies
where the recovered object(s) are to be placed. For "tar" format backups,
as implemented here, selected objects have the matching prefix substituted
by the dest field.
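The prefix match and the dest substitution can be sketched together. Both function names are hypothetical. The sketch also requires the match to end on a path component boundary, so that "home/bob" does not select "home/bobby"; that refinement is an assumption of the example and is not stated in the text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the length of the matched prefix, or 0 if path is not selected. */
size_t name_match(const char *name, const char *path)
{
    size_t n = strlen(name);

    if (n == 0 || strncmp(name, path, n) != 0)
        return 0;
    /* Assumption: the match must end on a component boundary. */
    if (path[n] != '\0' && path[n] != '/' && name[n - 1] != '/')
        return 0;
    return n;
}

/* Substitute the matching prefix of path with dest.
 * Returns 1 on a match (PASS), 0 on no match (DISCARD),
 * or -1 if out is too small to hold the result. */
int map_dest(const char *name, const char *dest, const char *path,
             char *out, size_t outsz)
{
    size_t n;
    int len;

    assert(name != NULL && dest != NULL && path != NULL && out != NULL);

    n = name_match(name, path);
    if (n == 0)
        return 0;
    len = snprintf(out, outsz, "%s%s", dest, path + n);
    if (len < 0 || (size_t)len >= outsz)
        return -1;
    return 1;
}
```

With name "home/bob" and dest "restore/bob", the object "home/bob/notes.txt" would be stored as "restore/bob/notes.txt".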
fh_info field
The fh_info field is said
to be an opaque object, with its contents known only to the relevant
DATA agents. In practice, it is a byte offset in the backup image. Some
NDMP implementations have problems if the fh_info field is anything else.
Neither the NDMP specifications (v2 and v3) nor the protocol make any provision
for identifying or recognizing the validity of the fh_info field. The
assumption is that if the DATA agent expects the fh_info field to be
valid, it MUST be valid. If the DATA agent does not support (implement)
the fh_info field, it is disregarded. This leads to a severe problem.
It means that for DATA agents which implement the fh_info field, a recovery
request simply cannot be processed without valid fh_info fields. There
is no way to cue the DATA agent to perform the recovery solely based
on the name and dest fields. The common answer is inadequate: use an
unspecified environment variable to indicate the status of the fh_info
field. This answer would lead to disparate practice for a fundamental
issue. Here, we establish a practice as proposed on the ndmp-tech e-mail
list.
Known invalid fh_info
all 1s
An fh_info field value of
all 1s indicates a known invalid fh_info.
RECOVER_FH_INFO_VALID
environment variable
The RECOVER_FH_INFO_VALID
environment variable is defined here. It has either "yes" or "no"
as its value. If "yes", the fh_info fields are considered valid, with
deference to the known invalid value (all 1s). If "no", all fh_info
fields are deemed invalid. If not given, the default is "yes".
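Combining the all-1s convention with RECOVER_FH_INFO_VALID gives a simple validity test. The FH_INFO_INVALID macro and the fh_info_is_valid() function are names assumed for this sketch; as an additional assumption, any value beginning with 'n' or 'N' is read as "no".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The known invalid fh_info value: all 1s in a 64-bit field. */
#define FH_INFO_INVALID UINT64_MAX

/* recover_fh_info_valid is the value of the RECOVER_FH_INFO_VALID
 * environment variable, or NULL if the variable was not given.
 * Returns 1 if fh_info may be used for positioning, 0 otherwise. */
int fh_info_is_valid(uint64_t fh_info, const char *recover_fh_info_valid)
{
    /* "no" invalidates every fh_info; absent means the default, "yes". */
    if (recover_fh_info_valid != NULL &&
        (recover_fh_info_valid[0] == 'n' || recover_fh_info_valid[0] == 'N'))
        return 0;

    /* "yes" still defers to the known invalid value. */
    return fh_info != FH_INFO_INVALID;
}
```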
RECOVER_PREFIX environment
variable
The RECOVER_PREFIX environment
variable is defined here. If given, the path names in the dest fields
are prepended with this value. The PREFIX environment variable plays
no role in recovery, and is considered merely informational about
when the backup image was created. If a dest field is not given (null
or empty), the name field is used as the dest, subject to the RECOVER_PREFIX
value.
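The destination rule can be sketched as follows. The recover_dest() function is hypothetical, and the sketch assumes that "prepended" means plain string concatenation with no path separator inserted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the final destination path for one ndmp_name entry.
 * prefix is the RECOVER_PREFIX value, or NULL if not given.
 * dest may be NULL or empty, in which case name is used instead.
 * Returns 0 on success, -1 if out is too small. */
int recover_dest(const char *prefix, const char *name, const char *dest,
                 char *out, size_t outsz)
{
    const char *target;
    int len;

    assert(name != NULL && out != NULL);

    target = (dest != NULL && strlen(dest) > 0) ? dest : name;
    if (prefix == NULL)
        prefix = "";

    /* Assumption: plain concatenation, no separator is added. */
    len = snprintf(out, outsz, "%s%s", prefix, target);
    return (len >= 0 && (size_t)len < outsz) ? 0 : -1;
}
```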
Direct/sequential vs. valid/invalid
fh_info
Here comes the tricky part.
Does invalid fh_info preclude direct access? No. Does valid fh_info preclude
sequential access? No.
This table shows the mode
used depending on the DIRECT environment variable and the status of
the fh_info field.
                        DIRECT=yes      DIRECT=no
========================================================
all fh_info valid       direct          sequential
all fh_info invalid     semi-direct     sequential
fh_info mixed           (see text)      sequential
========================================================
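The table translates directly into a small decision function. The enum and the choose_mode() name are assumptions of this sketch; the "(see text)" cell is represented by MODE_MIXED. A request with no entries at all is treated as direct here, which is a choice of the sketch rather than something the table specifies.

```c
#include <assert.h>
#include <stddef.h>

enum recover_mode {
    MODE_SEQUENTIAL,
    MODE_SEMI_DIRECT,
    MODE_DIRECT,
    MODE_MIXED
};

/* direct:    nonzero if the DIRECT (or RECOVER_DIRECT) variable is "yes"
 * n_valid:   number of ndmp_name entries with valid fh_info
 * n_invalid: number of ndmp_name entries with invalid fh_info */
enum recover_mode choose_mode(int direct, size_t n_valid, size_t n_invalid)
{
    if (!direct)
        return MODE_SEQUENTIAL;     /* right-hand column of the table */
    if (n_invalid == 0)
        return MODE_DIRECT;         /* all fh_info valid */
    if (n_valid == 0)
        return MODE_SEMI_DIRECT;    /* all fh_info invalid */
    return MODE_MIXED;              /* some valid, some invalid */
}
```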
Recovery modes
Sequential mode
All objects are ACQUIRED and
extracted from the image stream. SEQUENTIAL access is initiated (NOTIFY_DATA_READ
0/inf). Every DISPOSITION is determined by name field match. PASSED data
is copied to the formatter program. DISCARDED data is consumed and not
passed.
Semi-direct mode
All objects are ACQUIRED, and
selectively extracted from the image stream. Discrete NOTIFY_DATA_READ
requests are used for ACQUISITION. Once an object is acquired, its DISPOSITION
is determined based on name field match. PASSED data is requested by
NOTIFY_DATA_READ and passed to the formatter program. DISCARDED data is simply
skipped by omitting a corresponding NDMP_NOTIFY_DATA_READ request.
Direct mode
Objects are selectively ACQUIRED
and selectively extracted from the image stream. The ndmp_name entries in
the recovery request are processed in fh_info order. Each object is initially
ACQUIRED by direct access based on its fh_info field, then the semi-direct method
is employed until an object with disposition DISCARD is encountered. This
is the PASS->DISCARD edge. If the named object is a directory, then
all directory contents are PASSED until a non-matching name is encountered.
Then, the next ndmp_name entry is processed.
Mixed mode
The mixed mode is used when
some or all of the fh_info fields are known invalid. The semi-direct
mode is used until all ndmp_name entries with invalid fh_info values
are satisfied. The remaining ndmp_name entries are processed with the
direct mode.
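One way to plan a mixed-mode recovery is to sort the request so that entries with invalid fh_info come first (for the semi-direct pass) and the remaining entries follow in fh_info order (for the direct pass). The struct name_entry type and plan_mixed() are hypothetical; this is a sketch of the ordering only, not of how NDMJOBLIB schedules the work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FH_INFO_INVALID UINT64_MAX  /* known invalid value: all 1s */

/* Hypothetical, reduced form of an ndmp_name entry. */
struct name_entry {
    const char *name;
    uint64_t    fh_info;
};

/* Invalid fh_info sorts first; valid entries sort by ascending fh_info. */
static int cmp_mixed(const void *a, const void *b)
{
    const struct name_entry *x = a;
    const struct name_entry *y = b;
    int x_bad = (x->fh_info == FH_INFO_INVALID);
    int y_bad = (y->fh_info == FH_INFO_INVALID);

    if (x_bad != y_bad)
        return y_bad - x_bad;
    if (x->fh_info < y->fh_info)
        return -1;
    if (x->fh_info > y->fh_info)
        return 1;
    return 0;
}

/* Reorder names in place and return how many leading entries must be
 * satisfied by the semi-direct method before direct mode takes over. */
size_t plan_mixed(struct name_entry *names, size_t n)
{
    size_t n_invalid = 0;

    assert(names != NULL || n == 0);

    if (n > 0)
        qsort(names, n, sizeof names[0], cmp_mixed);
    while (n_invalid < n && names[n_invalid].fh_info == FH_INFO_INVALID)
        n_invalid++;
    return n_invalid;
}
```

Because qsort() is not a stable sort, the relative order among the invalid entries is unspecified; that is harmless here since the semi-direct pass matches them by name.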
Porting Tips