Document management system

A document management system (DMS) is a computer system (or set of computer programs) used to track and store electronic documents. It is usually also capable of keeping track of the different versions modified by different users (history tracking). The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.

History

In the 1980s, a number of vendors began developing software systems to manage paper-based documents. These systems handled not only printed and published documents, but also photographs, prints, and similar material.
Later, developers began to write a second type of system that could manage electronic documents, i.e., all those documents, or files, created on computers and often stored on users' local file systems. The earliest electronic document management (EDM) systems managed either proprietary file types or a limited number of file formats. Many of these systems later became known as document imaging systems, because they focused on the capture, storage, indexing and retrieval of image file formats. EDM systems evolved to a point where they could manage any type of file format that could be stored on the network. The applications grew to encompass electronic documents, collaboration tools, security, workflow, and auditing capabilities.
These systems enabled an organization to capture faxes and forms, to save copies of the documents as images, and to store the image files in the repository for security and quick retrieval (retrieval made possible because the system handled the extraction of the text from the document in the process of capture, and the text-indexer function provided text-retrieval capabilities).
While many EDM systems store documents in their native file format (Microsoft Word or Excel, PDF), some web-based document management systems are beginning to store content in the form of HTML. These policy management systems[1] require content to be imported into the system. However, once content is imported, the software acts like a search engine so users can find what they are looking for faster. The HTML format allows for better application of search capabilities such as full-text searching and stemming.[2]
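As a rough illustration of how full-text searching and stemming work together, the following Python sketch builds a small word index over imported text. The suffix list, the function names (`stem`, `build_index`, `search`) and the sample documents are assumptions made for this example only; production systems use more careful stemmers, such as the Porter algorithm, rather than this naive suffix stripping.

```python
from collections import defaultdict

# Hypothetical, deliberately naive suffix list (an assumption for this sketch).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict) -> dict:
    """Map each stemmed word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token.strip(".,;:!?"))].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return the ids of documents containing any word form of the query."""
    return index.get(stem(query), set())


# Hypothetical sample content.
docs = {
    "policy-1": "Employees requesting leave",
    "policy-2": "Leave requests are reviewed",
}
index = build_index(docs)
print(sorted(search(index, "requested")))
```

Because "requesting", "requests" and "requested" all reduce to the same stem, a query for any one form finds documents that use the others.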

Components

Document management systems commonly provide storage, versioning, metadata, security, as well as indexing and retrieval capabilities. Here is a description of these components:
Topic Description
Metadata Metadata is typically stored for each document. Metadata may, for example, include the date the document was stored and the identity of the user storing it. The DMS may also extract metadata from the document automatically or prompt the user to add metadata. Some systems also use optical character recognition on scanned images, or perform text extraction on electronic documents. The resulting extracted text can be used to assist users in locating documents by identifying probable keywords or providing for full text search capability, or can be used on its own. Extracted text can also be stored as a component of metadata, stored with the image, or separately as a source for searching document collections.
Integration Many document management systems attempt to integrate document management directly into other applications, so that users may retrieve existing documents directly from the document management system repository, make changes, and save the changed document back to the repository as a new version, all without leaving the application. Such integration is commonly available for office suites and e-mail or collaboration/groupware software. Integration often uses open standards such as ODMA, LDAP, WebDAV and SOAP to allow integration with other software and compliance with internal controls.
Capture Capture primarily involves accepting and processing images of paper documents from scanners or multifunction printers. Optical character recognition (OCR) software is often used, whether integrated into the hardware or as stand-alone software, in order to convert digital images into machine readable text. Optical mark recognition (OMR) software is sometimes used to extract values of check-boxes or bubbles. Capture may also involve accepting electronic documents and other computer-based files.
Indexing Indexing tracks electronic documents. Indexing may be as simple as keeping track of unique document identifiers; but often it takes a more complex form, providing classification through the documents' metadata or even through word indexes extracted from the documents' contents. Indexing exists mainly to support retrieval. One area of critical importance for rapid retrieval is the creation of an index topology.
Storage Storage holds the electronic documents. It often includes management of those same documents: where they are stored, for how long, migration of the documents from one storage medium to another (hierarchical storage management), and eventual document destruction.
Retrieval Retrieve the electronic documents from the storage. Although the notion of retrieving a particular document is simple, retrieval in the electronic context can be quite complex and powerful. Simple retrieval of individual documents can be supported by allowing the user to specify the unique document identifier, and having the system use the basic index (or a non-indexed query on its data store) to retrieve the document. More flexible retrieval allows the user to specify partial search terms involving the document identifier and/or parts of the expected metadata. This would typically return a list of documents which match the user's search terms. Some systems provide the capability to specify a Boolean expression containing multiple keywords or example phrases expected to exist within the documents' contents. The retrieval for this kind of query may be supported by previously built indexes, or may perform more time-consuming searches through the documents' contents to return a list of the potentially relevant documents. See also Document retrieval.
Distribution A document published for distribution has to be in a format that cannot be easily altered. As a common practice in law-regulated industries, the original master copy of a document is kept for archiving and is usually not itself distributed. If a document is to be distributed electronically in a regulatory environment, the equipment performing the job has to be both quality endorsed and validated, and similarly quality-endorsed electronic distribution carriers have to be used. Where the integrity of the document is critical, this applies to both of the systems between which the document is exchanged.
Security Document security is vital in many document management applications. Compliance requirements for certain documents can be quite complex depending on the type of documents. For instance, in the United States, the Health Insurance Portability and Accountability Act (HIPAA) imposes specific security requirements on medical documents. Some document management systems have a rights management module that allows an administrator to give access to documents, based on type, to only certain people or groups of people. Document marking at the time of printing or PDF creation is an essential element in precluding alteration or unintended use.
Workflow Workflow is a complex process and some document management systems have a built-in workflow module. There are different types of workflow. Usage depends on the environment to which the electronic document management system (EDMS) is applied. Manual workflow requires a user to view the document and decide whom to send it to. Rules-based workflow allows an administrator to create a rule that dictates the flow of the document through an organization: for instance, an invoice passes through an approval process and then is routed to the accounts-payable department. Dynamic rules allow for branches to be created in a workflow process. A simple example would be to enter an invoice amount and if the amount is lower than a certain set amount, it follows different routes through the organization. Advanced workflow mechanisms can manipulate content or signal external processes while these rules are in effect.
Collaboration Collaboration should be inherent in an EDMS. In its basic form, a collaborative EDMS should allow documents to be retrieved and worked on by an authorized user. Access should be blocked to other users while work is being performed on the document. Other advanced forms of collaboration allow multiple users to view and modify (or markup) a document at the same time in a collaboration session. The resulting document should be viewable in its final shape, while also storing the markups done by each individual user during the collaboration session.
Versioning Versioning is a process by which documents are checked in or out of the document management system, allowing users to retrieve previous versions and to continue work from a selected point. Versioning is useful for documents that change over time and require updating, but it may be necessary to go back to or reference a previous copy.
Searching Searching finds documents and folders using template attributes or full text search. Documents can be searched using various attributes and document content.
Publishing Publishing a document involves procedures such as proofreading, peer or public review, authorization, printing, and approval. These steps are intended to ensure care and accuracy. Careless handling may leave the document inaccurate and therefore mislead or upset its users and readers. In law-regulated industries, completion of some of these procedures has to be evidenced by the corresponding signatures and the date(s) on which the document was signed. Refer to the ISO divisions of ICS 01.140.40 and 35.240.30 for further information.[3][4] The published document should be in a format that is read-only or portable and that cannot easily be altered without specific knowledge or tools.[5]
Reproduction Document and image reproduction is a key consideration when implementing a system: documents that are put in must also come out in a usable form. Building plans are an example: an implementer has to decide how plans will be scanned and how their scale will be retained when they are printed.
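The Versioning and Collaboration rows above describe checking documents in and out, blocking other users while a document is being worked on, and retrieving earlier versions. A minimal Python sketch of that behaviour follows. The class and method names (`Repository`, `check_out`, `check_in`, `get_version`) and the metadata fields are hypothetical, chosen for illustration; they do not correspond to any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class CheckoutError(Exception):
    """Raised when a check-out or check-in violates the locking rule."""


@dataclass
class Document:
    doc_id: str
    # Each entry is (content, metadata), oldest version first.
    versions: list = field(default_factory=list)
    locked_by: Optional[str] = None


class Repository:
    """In-memory stand-in for a DMS repository (an assumption of this sketch)."""

    def __init__(self):
        self._docs = {}

    def _store(self, doc, content, user):
        # Record the metadata the Metadata row mentions: who stored it, and when.
        metadata = {"stored_by": user, "stored_at": datetime.now(timezone.utc)}
        doc.versions.append((content, metadata))

    def add(self, doc_id, content, user):
        doc = Document(doc_id)
        self._store(doc, content, user)
        self._docs[doc_id] = doc

    def check_out(self, doc_id, user):
        """Lock the document for one user and return its latest content."""
        doc = self._docs[doc_id]
        if doc.locked_by is not None:
            raise CheckoutError(f"{doc_id} is checked out by {doc.locked_by}")
        doc.locked_by = user
        return doc.versions[-1][0]

    def check_in(self, doc_id, user, content):
        """Save a new version, release the lock, return the new version number."""
        doc = self._docs[doc_id]
        if doc.locked_by != user:
            raise CheckoutError(f"{doc_id} is not checked out by {user}")
        self._store(doc, content, user)
        doc.locked_by = None
        return len(doc.versions)

    def get_version(self, doc_id, number):
        """Return the content of a 1-based version number."""
        return self._docs[doc_id].versions[number - 1][0]


repo = Repository()
repo.add("sop-7", "draft text", "alice")
working_copy = repo.check_out("sop-7", "bob")
new_number = repo.check_in("sop-7", "bob", working_copy + " (revised)")
print(new_number, repo.get_version("sop-7", 1))
```

Holding a single lock per document is the simplest way to block concurrent edits; systems that allow simultaneous markup, as described in the Collaboration row, would need a merge or session mechanism that this sketch does not attempt.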

Standardization

Many industry associations publish their own lists of document control standards used in their particular fields. Following is a list of some of the relevant ISO documents; see also ICS divisions 01.140.10 and 01.140.20.[6][7] The ISO has also published a series of standards on technical documentation, covered by division 01.110.[8]
  • ISO 2709 Information and documentation — Format for information exchange
  • ISO 15836 Information and documentation — The Dublin Core metadata element set
  • ISO 15489 Information and documentation — Records management
  • ISO 21127 Information and documentation — A reference ontology for the interchange of cultural heritage information
  • ISO 23950 Information and documentation — Information retrieval (Z39.50) — Application service definition and protocol specification
  • ISO 10244 Document management — Business process baselining and analysis
  • ISO 32000 Document management — Portable document format

Document control

Government regulations require that companies working in certain industries control their documents. These industries include accounting (e.g., 8th EU Directive, Sarbanes–Oxley Act), food safety (e.g., Food Safety Modernization Act), ISO (mentioned above), medical device manufacturing (FDA), manufacture of blood, human cells, and tissue products (FDA), healthcare (JCAHO), and information technology (ITIL).[9][10]
Documents stored in a document management system (such as procedures, work instructions, and policy statements) provide evidence that documents are under control. Failing to comply could result in fines, the loss of business, or damage to a business's reputation.
When working in an environment that requires document control, the following procedures are useful to document:
  • Reviewing and approving documents prior to release
  • Reviews and approvals
  • Ensuring changes and revisions are clearly identified
  • Ensuring that relevant versions of applicable documents are available at their “points of use”
  • Ensuring that documents remain legible and identifiable
  • Ensuring that external documents like customer supplied documents or supplier manuals are identified and controlled
  • Preventing “unintended” use of obsolete documents
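Several of the procedures listed above (approval before release, identifying changes, and preventing unintended use of obsolete documents) can be sketched as a small state model in Python. The status names, the `ControlledDocument` class and its methods are assumptions made for this illustration, not a prescribed or regulatory design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    DRAFT = "draft"
    RELEASED = "released"
    OBSOLETE = "obsolete"


@dataclass
class Revision:
    label: str
    content: str
    change_note: str  # identifies what changed in this revision
    status: Status = Status.DRAFT
    approved_by: Optional[str] = None


@dataclass
class ControlledDocument:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def add_draft(self, label, content, change_note):
        revision = Revision(label, content, change_note)
        self.revisions.append(revision)
        return revision

    def release(self, label, approver):
        """Approve a draft for release and retire any previously released revision."""
        target = next((r for r in self.revisions if r.label == label), None)
        if target is None:
            raise KeyError(label)
        if target.status is not Status.DRAFT:
            raise ValueError(f"revision {label} is not a draft")
        for revision in self.revisions:
            if revision.status is Status.RELEASED:
                revision.status = Status.OBSOLETE
        target.approved_by = approver
        target.status = Status.RELEASED

    def current(self):
        """Return the revision to show at points of use, or None if nothing is released."""
        for revision in self.revisions:
            if revision.status is Status.RELEASED:
                return revision
        return None


sop = ControlledDocument("Cleaning procedure")
sop.add_draft("A", "Rinse twice.", "Initial issue")
sop.release("A", approver="qa.manager")
sop.add_draft("B", "Rinse three times.", "Increased rinse count")
sop.release("B", approver="qa.manager")
print(sop.current().label, [r.status.value for r in sop.revisions])
```

Because `current()` only ever returns a released revision, drafts and obsolete revisions stay in the history for audit purposes but are never served at the point of use.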

References

  1. Policy Management System
  2. Stemming: Making searching easier
  3. International Organization for Standardization. "01.140.40: Publishing". Retrieved 14 July 2008.
  4. International Organization for Standardization. "35.240.30: IT applications in information, documentation and publishing". Retrieved 14 July 2008.
  5. OnSphere Corporation. "SOP Document Management in a Validated Environments". Retrieved 25 April 2011.
  6. International Organization for Standardization. "01.140.10: Writing and transliteration". Retrieved 14 July 2008.
  7. International Organization for Standardization. "01.140.20: Information sciences". Retrieved 14 July 2008.
  8. International Organization for Standardization. "01.110: Technical product documentation". Retrieved 15 July 2008.
  9. Anderson, Chris. "Is Document Control Really That Important?". Bizmanualz. 17 December 2010.
  10. Food and Drug Administration. "Code of Federal Regulations Title 21, Part 1271". Retrieved 31 January 2012.